MLflow is a popular open source solution for managing all aspects of the machine learning lifecycle. The platform encompasses four components:
- MLflow Tracking to record code, data, configuration, and results of ML experiments
- MLflow Projects to package data science code in a format that allows it to run reproducibly in different environments
- MLflow Models to deploy ML models in different environments
- MLflow Model Registry to store and manage ML models in a central repository
To learn more about MLflow and its capabilities, see the MLflow documentation.
Reporting Anovos data to MLflow Tracking
Anovos integrates with MLflow by reporting workflow metadata and results to MLflow Tracking.
To track your workflows with MLflow, add an mlflow block to your workflow configuration file:
mlflow:
  experiment: "Anovos"  # The name of the MLflow experiment associated with your workflow
  tracking_uri: "http://127.0.0.1:8889"  # The URL of the MLflow Tracking server
  track_output: True  # Store the workflow output (i.e., resulting dataset(s))
  track_reports: True  # Store the generated reports
  track_intermediates: False  # Store any intermediate data generated by your workflow
It is currently not possible to select which intermediate outputs are stored. If track_intermediates is set to True, all intermediate outputs will be stored.
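To make the effect of the three track_* flags concrete, here is a minimal Python sketch of how such a block could translate into MLflow Tracking calls. It is not Anovos's actual implementation: the dictionary, the category names, and both helper functions are hypothetical and exist only for illustration. The only MLflow calls used are set_tracking_uri, set_experiment, start_run, and log_artifacts.

```python
from typing import Dict, List

# Hypothetical in-memory form of the mlflow block (same keys as the YAML above).
MLFLOW_CONFIG = {
    "experiment": "Anovos",
    "tracking_uri": "http://127.0.0.1:8889",
    "track_output": True,
    "track_reports": True,
    "track_intermediates": False,
}

# Hypothetical mapping from each flag to an artifact category name.
FLAG_TO_CATEGORY = {
    "track_output": "output",
    "track_reports": "reports",
    "track_intermediates": "intermediates",
}


def categories_to_track(config: Dict[str, object]) -> List[str]:
    """Return the artifact categories whose flag is enabled in the config."""
    return [cat for flag, cat in FLAG_TO_CATEGORY.items() if config.get(flag, False)]


def report_run(config: Dict[str, object], local_dirs: Dict[str, str]) -> None:
    """Log every enabled category as artifacts of a single MLflow run.

    Requires the mlflow package and a reachable tracking server.
    local_dirs maps a category name to a local directory (an assumption).
    """
    import mlflow  # imported lazily so categories_to_track works without MLflow

    mlflow.set_tracking_uri(config["tracking_uri"])
    mlflow.set_experiment(config["experiment"])
    with mlflow.start_run():
        for category in categories_to_track(config):
            # All-or-nothing per category: there is no per-file selection.
            mlflow.log_artifacts(local_dirs[category], artifact_path=category)


print(categories_to_track(MLFLOW_CONFIG))
```

With the flags shown, only the output and the reports would be logged. Setting track_intermediates to True adds every intermediate result, which matches the all-or-nothing behavior described above.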
Using MLflow on Azure Databricks
If you are running Anovos workloads on Azure Databricks, you can use the integrated Managed MLflow to track your Anovos runs and artifacts.
To learn more about moving your Anovos workloads to Azure Databricks, see the 📖 Setting up Anovos on Azure Databricks guide.
To track an Anovos workflow with Managed MLflow, you first need to create a new MLflow experiment. This is possible either through the Databricks Machine Learning UI or the MLflow API. Please refer to the Azure Databricks documentation for detailed and up-to-date instructions.
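If you prefer the MLflow API over the UI, the following sketch shows one way to create the experiment from Python. It assumes that the mlflow package is installed and that Databricks authentication is already configured on the machine running it. The helper names experiment_path and ensure_experiment are hypothetical. The Azure Databricks documentation remains the authoritative reference.

```python
def experiment_path(user: str, name: str) -> str:
    """Build a workspace path of the form /Users/<user>/<name>."""
    return f"/Users/{user.strip('/')}/{name.strip('/')}"


def ensure_experiment(path: str) -> str:
    """Return the ID of the experiment at `path`, creating it if it does not exist.

    Requires the mlflow package and working Databricks credentials.
    """
    import mlflow  # imported lazily so experiment_path works without MLflow

    mlflow.set_tracking_uri("databricks")
    existing = mlflow.get_experiment_by_name(path)
    if existing is not None:
        return existing.experiment_id
    return mlflow.create_experiment(path)


path = experiment_path("your_user_name@your_domain.tld", "your_experiment_name")
print(path)
# experiment_id = ensure_experiment(path)  # uncomment when connected to Databricks
```

Checking for an existing experiment first keeps the helper safe to call repeatedly, since creating an experiment under a name that is already taken raises an error.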
Once you have created an experiment for your workflow, use its "Location" as the experiment value in the mlflow block of the Anovos workflow configuration. In addition, tracking_uri needs to be set to "databricks":
mlflow:
  experiment: "/Users/your_user_name@your_domain.tld/your_experiment_name"
  tracking_uri: "databricks"
  track_output: True  # Store the workflow output (i.e., resulting dataset(s))
  track_reports: True  # Store the generated reports
  track_intermediates: False  # Store any intermediate data generated by your workflow
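A quick sanity check can catch the two most likely mistakes in this block before a run starts. The sketch below is a hypothetical helper, not part of Anovos. It assumes that the experiment is given as an absolute workspace path, as in the example above, and that the block has already been loaded into a Python dictionary.

```python
from typing import Dict, List


def check_databricks_block(block: Dict[str, object]) -> List[str]:
    """Return a list of human-readable problems; the list is empty if none are found."""
    problems = []
    if block.get("tracking_uri") != "databricks":
        problems.append('tracking_uri should be "databricks"')
    experiment = block.get("experiment")
    # Assumption: the experiment "Location" is an absolute workspace path.
    if not isinstance(experiment, str) or not experiment.startswith("/"):
        problems.append("experiment should be the experiment's Location, e.g. /Users/<user>/<name>")
    return problems


example = {
    "experiment": "/Users/your_user_name@your_domain.tld/your_experiment_name",
    "tracking_uri": "databricks",
    "track_output": True,
    "track_reports": True,
    "track_intermediates": False,
}
print(check_databricks_block(example))
```

An empty list means the block passes both checks. A block copied unchanged from a local MLflow setup would be reported with two problems.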
We're exploring integration of Anovos with MLflow Projects and MLflow Pipelines. Let us know which capabilities you'd like to see in future versions of Anovos!