Machine Learning in Julia

A Julia package for general, composable machine learning at scale.

Get Started with MLJ

Evaluate the performance of a model and tune its hyper-parameters

  1. Install the MLJ Package
  2. Load Some Data
  3. Train Your First Model
  4. Evaluate Your Model
  5. Hyperparameter Tuning

using Pkg

# Create a new environment (optional):
Pkg.activate("mlj-env", shared=true)

# Install MLJ
Pkg.add("MLJ")

# Install RDatasets (needed for demo):
Pkg.add("RDatasets")

# Use these packages:
using MLJ, RDatasets

MLJ Features

  • Choosing Models
  • Meta-algorithms
  • Smart Pipelines
  • Nested Tuning
  • Learning Networks
  • Iteration Control

Choosing Models

# identify models suitable for your data
julia> X, y = @load_iris
julia> models(matching(X, y))
54-element Vector
 (name = AdaBoostClassifier, package_name = MLJScikitLearnInterface, ... )
 (name = AdaBoostStumpClassifier, package_name = DecisionTree, ... )
 (name = BaggingClassifier, package_name = MLJScikitLearnInterface, ... )

# filter by docstring content:
julia> models("pca")
(name = PCA, package_name = MultivariateStats, ... )
(name = PCADetector, package_name = OutlierDetectionPython, ... )

# retrieve docs without code import:
julia> doc("PCA", pkg="MultivariateStats")

A Model Registry stores detailed metadata for over 200 models, and documentation can be searched without loading model code.

Meta-algorithms

# choose a model and define hyperparameter ranges for tuning:
model = XGBoostRegressor()
r1 = range(model, :max_depth, lower=3, upper=10)
# a log scale needs a positive lower bound (0.001 is an illustrative value):
r2 = range(model, :gamma, lower=0.001, upper=10, scale=:log)

# wrap model to create self-tuning version:
tuned_model = TunedModel(model, range=[r1, r2], resampling=CV(), measure=l2)

# this both optimises and retrains on all data:
mach = machine(tuned_model, X, y) |> fit!

# predict using optimised params:
predict(mach, Xnew)

# inspect optimisation outcomes:
report(mach).best_model   # best model found by the search
plot(mach)                # needs a plotting package, such as Plots.jl, loaded

For improved composability and better data hygiene, many meta-algorithms are implemented as model wrappers, including TunedModel (hyper-parameter tuning) and IteratedModel (iteration control).

In this way, a model wrapped in a tuning strategy, for example, becomes a "self-tuning" model, with all data resampling (e.g., cross-validation) managed under the hood.
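
To make "managed under the hood" concrete, here is a minimal sketch in plain Julia of what a self-tuning wrapper does conceptually: score each candidate hyper-parameter by cross-validation, pick the best, then retrain on all the data. This is not MLJ code and not how TunedModel is implemented. The helper names (fit_slope, cv_loss, self_tuning_fit), the toy one-dimensional ridge model, and the interleaved folds are all assumptions made for illustration. It assumes Julia 1.7 or later for the two-argument argmin.

```julia
# Toy 1-D ridge regression through the origin (illustration only, not MLJ).
fit_slope(x, y, lambda) = sum(x .* y) / (sum(abs2, x) + lambda)
mean_squared_error(yhat, y) = sum(abs2, yhat .- y) / length(y)

# Mean held-out loss for one candidate `lambda` under k-fold cross-validation.
function cv_loss(x, y, lambda; nfolds=3)
    n = length(y)
    folds = [i:nfolds:n for i in 1:nfolds]   # interleaved test folds
    losses = map(folds) do test
        train = setdiff(1:n, test)           # everything not held out
        slope = fit_slope(x[train], y[train], lambda)
        mean_squared_error(slope .* x[test], y[test])
    end
    return sum(losses) / nfolds
end

# "Self-tuning" fit: choose the best candidate, then retrain on all the data.
function self_tuning_fit(x, y, candidates)
    best = argmin(lambda -> cv_loss(x, y, lambda), candidates)
    return (lambda = best, slope = fit_slope(x, y, best))
end

x = collect(1.0:6.0)
y = 2 .* x                                   # noiseless data: no shrinkage is best
result = self_tuning_fit(x, y, [0.0, 1.0, 10.0])
```

The caller only sees self_tuning_fit. The resampling loop never leaks into user code, which is the data-hygiene benefit the wrapper design is aiming for.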

Smart Pipelines

# create and train a pipeline model:
pipe = OneHotEncoder() |> PCA(maxoutdim=3) |> DecisionTreeClassifier()
mach = machine(pipe, X, y) |> fit!

# get actual PCA reduction dimension:
report(mach).pca.outdim

# get the tree:
fitted_params(mach).decision_tree_classifier.tree

Conventional model pipelines are available out of the box. Hyper-parameters of different components can be tuned simultaneously, but only the components that need it are retrained in each pipeline evaluation. The training report and the learned parameters of a pipeline are both broken down by component.

Nested Tuning

# create pipeline:
julia> pipe = ContinuousEncoder() |> RidgeRegressor()
  DeterministicPipeline(
      continuous_encoder = ContinuousEncoder(
          drop_last = false,
          one_hot_ordered_factors = false),
      ridge_regressor = RidgeRegressor(
          lambda = 1.0,
          fit_intercept = true,
          penalize_intercept = false,
          scale_penalty_with_samples = true,
          solver = nothing),
      cache = true)

# define one or more hyperparameter ranges:
julia> r = range(pipe, :(ridge_regressor.lambda), lower=0.001, upper=10.0)

# create self-tuning version of pipeline:
julia> tuned_model = TunedModel(pipe, range=r, resampling=CV(), measure=l2)

Creating pipelines, or wrapping models in meta-algorithms such as iteration control, creates nested hyper-parameters. Such parameters can be optimized like any other.
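
As a rough illustration of how a dotted name such as :(ridge_regressor.lambda) can address a nested hyper-parameter, the sketch below resolves the expression one field at a time. It is plain Julia with hypothetical types (ToyRidge, ToyPipeline) and hypothetical helpers (get_nested, set_nested!). It is not MLJ's own implementation, only a picture of the idea.

```julia
# Hypothetical stand-ins for a model and a pipeline that contains it.
mutable struct ToyRidge
    lambda::Float64
end

mutable struct ToyPipeline
    ridge_regressor::ToyRidge
end

# A dotted expression :(a.b) is Expr(:., :a, QuoteNode(:b)), so we can
# recurse on the left-hand side until only a plain Symbol is left.
get_nested(obj, name::Symbol) = getproperty(obj, name)
get_nested(obj, path::Expr) =
    getproperty(get_nested(obj, path.args[1]), path.args[2].value)

# Set the field named by the last component of `path` on its parent object.
function set_nested!(obj, path::Expr, value)
    parent = get_nested(obj, path.args[1])
    setproperty!(parent, path.args[2].value, value)
    return obj
end

pipe = ToyPipeline(ToyRidge(1.0))
set_nested!(pipe, :(ridge_regressor.lambda), 0.01)
current = get_nested(pipe, :(ridge_regressor.lambda))
```

Because a nested parameter can be read and written through a single name, a tuning strategy can treat it exactly like a top-level one.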

Learning Networks

# wrap data in "source nodes":
X, y = source.((X, y))

# a normal MLJ workflow, with `fit` calls omitted:
mach1 = machine(model1, X, y)
mach2 = machine(model2, X, y)
y1 = predict(mach1, X) # a callable "node"
y2 = predict(mach2, X)
y = 0.5*(y1 + y2)

# train all models with one call:
fit!(y, acceleration=CPUThreads())

# blended prediction for new data:
y(Xnew)

In principle, any MLJ workflow is readily transformed into a lazily executed learning network.

For example, in the code block above, fit! triggers training of both models in parallel, and the last line returns the blended (averaged) prediction. Mutate a hyper-parameter of model1, call fit! again, and only model1 is retrained.

Learning networks can be exported as new stand-alone model types. Internally, MLJ's pipelines and stacks are implemented using learning networks, which demonstrates their flexibility.
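
The "only model1 is retrained" behaviour can be pictured with a small plain-Julia sketch. Everything here is hypothetical and much simpler than MLJ's machines and nodes: ToyModel, ToyMachine, train!, fit_all! and blend are illustrative names, and the staleness check simply compares the current hyper-parameter with the one used at the last fit.

```julia
# Toy model with one hypothetical hyper-parameter (not an MLJ type).
mutable struct ToyModel
    shrinkage::Float64
end

# Toy machine: binds a model to data and remembers how it was last trained.
mutable struct ToyMachine
    model::ToyModel
    X::Vector{Float64}
    y::Vector{Float64}
    fitted_with::Union{Nothing,Float64}   # hyper-parameter used at last fit
    slope::Float64
    nfits::Int                            # how many times training really ran
end

ToyMachine(model, X, y) = ToyMachine(model, X, y, nothing, 0.0, 0)

# Train only if the model changed since the last fit; otherwise do nothing.
function train!(mach::ToyMachine)
    mach.fitted_with == mach.model.shrinkage && return mach
    mach.slope = sum(mach.X .* mach.y) / (sum(abs2, mach.X) + mach.model.shrinkage)
    mach.fitted_with = mach.model.shrinkage
    mach.nfits += 1
    return mach
end

fit_all!(machs) = foreach(train!, machs)
predict(mach::ToyMachine, Xnew) = mach.slope .* Xnew
blend(machs, Xnew) = sum(predict(m, Xnew) for m in machs) / length(machs)

X = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
model1, model2 = ToyModel(0.0), ToyModel(14.0)
machs = (ToyMachine(model1, X, y), ToyMachine(model2, X, y))

fit_all!(machs)            # first call: both machines train
model1.shrinkage = 14.0    # mutate a hyper-parameter of model1 only
fit_all!(machs)            # second call: only the first machine retrains
blended = blend(machs, [1.0])
```

After the second call the first machine has trained twice and the second only once, which is the kind of bookkeeping a learning network does for every node.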

Iteration Control

# choose an iterative model:
model = EvoTreeRegressor() # with iteration parameter `nrounds`

# choose iteration controls:
losses = []
controls = [Step(1), Patience(5), WithLossDo(x->push!(losses,x))]

iterated_model = IteratedModel(
    model;
    controls,
    measure=l2,
    resampling=Holdout(),
    retrain=true,
)

# train on internal holdout to find `nrounds` and retrain on all data:
mach = machine(iterated_model, X, y) |> fit!

# predict on new data:
predict(mach, Xnew)

MLJ provides a rich supply of iterative model "controls", such as early stopping criteria, snapshots, and callbacks for visualization. Any model with an iteration parameter can be wrapped in such controls, the iteration parameter becoming an additional learned parameter.
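
To give a feel for what an early stopping criterion does, the following plain-Julia sketch stops once the out-of-sample loss has increased a given number of times in a row. It is a simplified reading of a patience-style rule, written as a hypothetical helper (stopping_iteration) over a precomputed loss history. It is not the library's Patience implementation, whose exact semantics should be checked in its documentation.

```julia
# Return the iteration at which training would stop, or `nothing` if the
# loss never increases `patience` times in a row. Hypothetical helper.
function stopping_iteration(losses; patience=5)
    consecutive_increases = 0
    for i in 2:length(losses)
        if losses[i] > losses[i-1]
            consecutive_increases += 1
        else
            consecutive_increases = 0   # any improvement resets the count
        end
        consecutive_increases >= patience && return i
    end
    return nothing
end

# Made-up loss history: improves for three iterations, then deteriorates.
losses = [1.0, 0.8, 0.7, 0.75, 0.8, 0.9, 0.6]
stop = stopping_iteration(losses; patience=3)
```

Here the loss rises at iterations 4, 5 and 6, so with patience=3 the rule fires at iteration 6 and the later improvement is never seen. That trade-off is why patience is usually combined with other controls.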

MLJ Partners