Choosing Models

# identify models suitable for your data
julia> X, y = @load_iris
julia> models(matching(X, y))
54-element Vector
 (name = AdaBoostClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = AdaBoostStumpClassifier, package_name = DecisionTree, ... )
 (name = BaggingClassifier, package_name = MLJScikitLearnInterface, ... )
 ⋮
# filter by docstring content:
julia> models("pca")
(name = PCA, package_name = MultivariateStats, ... )
(name = PCADetector, package_name = OutlierDetectionPython, ... )

# retrieve docs without code import:
julia> doc("PCA", pkg="MultivariateStats")

A Model Registry stores detailed metadata for over 200 models and documentation can be searched without loading model code.

Meta-algorithms

# choose a model and define hyperparameter ranges for tuning:
model = XGBoostRegressor()
r1 = range(model, :max_depth, lower=3, upper=10)
r2 = range(model, :gamma, lower=0, upper=10, scale=:log)

# wrap model to create self-tuning version:
tuned_model = TunedModel(model, range=[r1, r2], resampling=CV(), measure=l2)

# this both optimises and retrains on all data:
mach = machine(tuned_model, data) |> fit!

# predict using optimised params:
predict(mach, Xnew)

# inspect optimisation outcomes
report(mach).best_model # inspect optim. results
plot(mach)

For improved composability, and better data hygiene, an extensive number of meta-algorithms are implemented as model wrappers:

Hyperparameter tuning
Control of iterative models
Correction for class imbalance
Homogeneous model ensembling
Wolpert model stacking
Recursive feature elimination
Target transforms / inverse transforms
Thresholding probabilistic predictors

In this way, a model wrapped in a tuning strategy, for example, becomes a "self-tuning" model, with all data resampling (e.g., cross-validation) managed under the hood.

Smart Pipelines

# create and train a pipeline model:
pipe = OneHotEncoder() |> PCA(maxout=3) |> DecisionTreeClassifier()
mach = machine(pipe, X, y) |> fit!

# get actual PCA reduction dimension:
report(mach).pca.outdim

# get the tree:
fitted_params(mach).decision_tree_classifier.tree

Conventional model pipelines are available out-of-the box. Hyper-parameters of different model components can be simultaneously tuned, but only necessary components are retrained in each pipeline evaluation. Training reports expose reports for individual components, and the same holds for learned parameters.

Nested Tuning

# create pipeline:
julia> pipe = ContinuousEncoder() |> RidgeRegressor()
  DeterministicPipeline(
      continuous_encoder = ContinuousEncoder(
          drop_last = false,
          one_hot_ordered_factors = false),
      ridge_regressor = RidgeRegressor(
          lambda = 1.0,
          fit_intercept = true,
          penalize_intercept = false,
          scale_penalty_with_samples = true,
          solver = nothing),
      cache = true)

# define one or more hyperparameter ranges:
julia> r = range(pipe, :(ridge_regressor.lambda), lower=0.001, upper=10.0)

# create self-tuning version of pipeline:
julia> tuned_model = TunedModel(pipe, range=r, resampling=CV(), measure=l2)

Creating pipelines, or wrapping models in meta-algorithms, such as iteration control, creates nested hyper-parameters. Such parameters can be optimized like any other.

Learning Networks

# wrap data in "source nodes":
X, y = source.(X, y)

# a normal MLJ workflow, with `fit` calls omitted:
mach1 = machine(model1, X, y)
mach2 = machine(model2, X, y)
y1 = predict(mach1, X) # a callable "node"
y2 = predict(mach2, X)
y = 0.5*(y1 + y2)

# train all models with one call:
fit!(y, acceleration=CPUThreads())

# blended prediction for new data:
y(Xnew)

In principle, any MLJ workflow is readily transformed into a lazily executed learning network.

For example, in the code block opposite, fit! triggers training of both models in parallel, and the last line returns the average prediction. Mutate a hyper-parameter of model1, call fit! again, and only model1 is retrained.

Learning networks can be exported as new stand-alone model types. Internally, MLJ's pipelines and stacks are implemented using learning networks, which demonstrates their flexibility.

Iteration control

# choose an iterative model:
model = EvoTreeRegressor() # with iteration parameter `nrounds`

# choose iteration controls:
losses = []
controls = [Step(1), Patience(5), WithLossDo(x->push!(losses,x))]

iterated_model = IteratedModel(
    model;
    controls,
    measure=l2,
    resampling=Holdout(),
    retrain=true,
)

# train on internal holdout to find `nrounds` and retrain on all data:
mach = machine(iterated_mode, X, y) |> fit!

# predict on new data:
predict(mach, Xnew)

MLJ provides a rich supply of iterative model "controls", such as early stopping criteria, snapshots, and callbacks for visualization. Any model with an iteration parameter can be wrapped in such controls, the iteration parameter becoming an additional learned parameter.

Machine Learning in Julia

Get Started with MLJ