For a condensed summary of information presented here, do `using MLJ; @doc MLJ`. Repositories of some possible interest outside of MLJ, or beyond its conventional use, are marked with a ⟂ symbol.
MLJ.jl is the general user's point-of-entry for choosing, loading,
composing, evaluating and tuning machine learning models.
It pulls in most code from other repositories described below.
MLJ also hosts the MLJ manual which documents functionality across the repositories,
although some pages point to documentation hosted locally by a particular package.
MLJModelInterface.jl is a lightweight package imported by packages
implementing MLJ's interface for their machine learning models.
It depends on ScientificTypesBase.jl and
StatisticalTraits.jl (which depends only on ScientificTypesBase.jl).
MLJBase.jl is a large repository with two main purposes:
(i) to give "dummy" methods defined in MLJModelInterface their intended functionality (which depends on third-party packages,
such as Tables.jl, Distributions.jl and CategoricalArrays.jl); and
(ii) to provide functionality essential to the MLJ user that has not been relegated to its own
"satellite" repository.
See the MLJBase.jl README.md for a detailed description of MLJBase's contents.
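The split between MLJModelInterface and MLJBase can be illustrated with a minimal sketch of a model implementation. `MeanRegressor` is a hypothetical model invented for this example, not part of any MLJ package. The model code imports only MLJModelInterface; loading MLJBase afterwards is assumed to be what makes table utilities such as `nrows` fully functional and what supplies `machine`, `fit!` and `predict`.

```julia
import MLJModelInterface as MMI

# Hypothetical model, for illustration only: always predicts the training-target mean.
mutable struct MeanRegressor <: MMI.Deterministic end

# `fit` returns a (fitresult, cache, report) triple.
function MMI.fit(::MeanRegressor, verbosity, X, y)
    fitresult = sum(y) / length(y)
    return fitresult, nothing, nothing
end

# `MMI.nrows` is applied to a table here; this is assumed to rely on MLJBase
# (or MLJ) having been loaded before `predict` is actually called.
MMI.predict(::MeanRegressor, fitresult, Xnew) = fill(fitresult, MMI.nrows(Xnew))

using MLJBase   # supplies `machine`, `fit!`, and the user-facing `predict`

X = (x1 = [1.0, 2.0, 3.0],)   # a column table (any Tables.jl-compatible table works)
y = [1.0, 2.0, 3.0]

mach = machine(MeanRegressor(), X, y)
fit!(mach, verbosity=0)
yhat = predict(mach, X)       # a vector of three copies of the mean of y
```

A package providing such a model would depend only on MLJModelInterface, leaving the heavier dependencies to the user's environment.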
StatisticalMeasures.jl provides performance measures (metrics) such as losses and scores.
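As a small sketch of how such measures are called, the following applies two common regression losses to made-up data. It assumes StatisticalMeasures.jl is installed; the vectors are illustrative only.

```julia
using StatisticalMeasures

y    = [1.0, 2.0, 3.0, 4.0]   # ground truth (made-up data)
yhat = [1.0, 2.0, 3.0, 6.0]   # predictions (made-up data)

# Measures are called as measure(prediction, ground_truth).
mae_value = mae(yhat, y)   # mean absolute error: (0 + 0 + 0 + 2)/4 = 0.5
rms_value = rms(yhat, y)   # root mean squared error: sqrt(4/4) = 1.0
```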
MLJModels.jl hosts the MLJ model registry,
which contains metadata on all the models the MLJ user can search and load from MLJ. Moreover,
it provides the functionality for loading model code from MLJ on demand. Finally, it furnishes some
commonly used transformers for data pre-processing, such as ContinuousEncoder and Standardizer.
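The three roles described above (registry search, on-demand loading, built-in transformers) might look as follows in a session. This is a sketch assuming MLJ is installed; the table `X` is made-up data, and the `@load` line is left commented out because it additionally requires the DecisionTree interface package to be in the active environment.

```julia
using MLJ

# 1. Search the registry: each entry carries metadata for one model.
all_models = models()
entry = first(all_models)
entry.name, entry.package_name

# Keyword search over model names and metadata.
models("Standardizer")

# 2. Load model code on demand (requires the providing package to be installed):
# Tree = @load DecisionTreeClassifier pkg=DecisionTree

# 3. Use a built-in transformer; no loading step is needed.
X = (height = [1.85, 1.67, 1.50, 1.72], weight = [80.0, 60.5, 55.0, 70.2])
mach = machine(Standardizer(), X)
fit!(mach, verbosity=0)
W = transform(mach, X)   # columns rescaled to zero mean and unit standard deviation
```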
MLJTuning.jl provides MLJ's TunedModel wrapper for hyper-parameter optimization,
including the extendable API for tuning strategies, and selected in-house implementations, such as Grid and RandomSearch.
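A minimal sketch of the wrapper in use, with the Grid strategy, is given below. It assumes that MLJLinearModels.jl is installed in the active environment (so that `RidgeRegressor` can be loaded) and uses synthetic data from `make_regression`; the hyper-parameter range and resampling settings are arbitrary choices for illustration.

```julia
using MLJ

# Assumption: MLJLinearModels.jl is installed in the active environment.
Ridge = @load RidgeRegressor pkg=MLJLinearModels verbosity=0
model = Ridge()

# Synthetic regression data: 100 observations, 3 features.
X, y = make_regression(100, 3; rng=123)

# One-dimensional range for the regularization strength, on a log scale.
r = range(model, :lambda; lower=1e-3, upper=10.0, scale=:log10)

# Wrapping the model yields a new model that tunes itself when fitted.
tuned = TunedModel(
    model=model,
    tuning=Grid(resolution=10),
    resampling=CV(nfolds=3),
    range=r,
    measure=rms,
)

mach = machine(tuned, X, y)
fit!(mach, verbosity=0)

best = fitted_params(mach).best_model
best.lambda   # the value of lambda selected by the grid search
```

Swapping `Grid(resolution=10)` for `RandomSearch()` changes the strategy without altering the rest of the workflow.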
MLJEnsembles.jl provides MLJ's EnsembleModel wrapper, for creating homogeneous model ensembles.
MLJIteration.jl provides the IteratedModel wrapper for controlling iterative models (snapshots, early stopping criteria, etc.).
FeatureSelection.jl provides models for choosing features, and includes the RecursiveFeatureElimination wrapper.
MLJFlow.jl provides integration with the platform-agnostic machine learning tracking tool MLflow (mlflow.org).
OpenML.jl provides integration with the OpenML (www.openml.org) data science exchange platform.
MLJLinearModels.jl provides a wide range of Julia-native penalized linear models such as Lasso, Elastic-Net, Robust regression, LAD regression, etc.
MLJFlux.jl provides support for some neural network models,
built with Flux.jl.
ScientificTypesBase.jl is an ultra lightweight package providing "scientific" types, such as
Continuous, OrderedFactor, Image and Table. Its purpose is to formalize conventions around the
scientific interpretation of ordinary machine types, such as Float32 and DataFrame.
ScientificTypes.jl articulates the particular convention for the scientific interpretation of data that MLJ adopts.
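The distinction between machine types and their scientific interpretation can be seen in a short sketch. It assumes ScientificTypes.jl is installed and uses the default convention; the table `X` is made-up data.

```julia
using ScientificTypes

# Under the default convention, machine types map to scientific types:
scitype(3.14)      # Continuous
scitype(42)        # Count
scitype("hello")   # Textual

# Inspect the scientific types of the columns of a table.
X = (age = [23, 45, 31], height = [1.85, 1.67, 1.50])
sch = schema(X)    # age is interpreted as Count, height as Continuous

# Change the interpretation where the default is not the intended one.
Xfixed = coerce(X, :age => Continuous)
```

After coercion the `age` column is stored as floating point, so models requiring `Continuous` input will accept it.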
StatisticalTraits.jl is an ultra lightweight package defining fall-back implementations for
a collection of traits possessed by statistical objects, principally models and measures (metrics).
DataScienceTutorials collects tutorials on how to use MLJ, which are deployed at JuliaAI.github.io/DataScienceTutorials.jl/.
MLJTestInterface provides tests for implementations of the MLJ model interface.
MLJTestIntegration provides tests for the entire MLJ ecosystem. (Called when you run `ENV["MLJ_TEST_INTEGRATION"]="true"; Pkg.test("MLJ")`.)