How to use MLflow for experiment tracking
Quick answer
Use MLflow to track machine learning experiments by logging parameters, metrics, and artifacts within an mlflow.start_run() context. This enables centralized experiment tracking, comparison, and reproducibility.
Prerequisites
- Python 3.8+
- pip install "mlflow>=2.0" (quoted so the shell does not treat >= as a redirect)
- Basic knowledge of Python and machine learning
Setup
Install mlflow via pip and set up a local tracking server or use the default local file-based tracking. No environment variables are required for local tracking.
pip install mlflow
Step by step
This example shows how to track an experiment by logging parameters, metrics, and artifacts using mlflow in Python.
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)
# Start MLflow run
with mlflow.start_run():
    # Define and train model
    clf = RandomForestClassifier(n_estimators=100, max_depth=3, random_state=42)
    clf.fit(X_train, y_train)
    # Predict and evaluate
    preds = clf.predict(X_test)
    acc = accuracy_score(y_test, preds)
    # Log parameters
    mlflow.log_param("n_estimators", 100)
    mlflow.log_param("max_depth", 3)
    # Log metric
    mlflow.log_metric("accuracy", acc)
    # Log model
    mlflow.sklearn.log_model(clf, "random_forest_model")

print(f"Logged run with accuracy: {acc:.4f}")
Output:
Logged run with accuracy: 1.0000
Common variations
- Use mlflow.start_run(run_name="my_run") to name runs.
- Point the client at a remote tracking server by setting the MLFLOW_TRACKING_URI environment variable.
- Log additional artifacts like plots or datasets with mlflow.log_artifact().
- Use mlflow.projects to run reproducible projects.
import os
import mlflow
# Set remote tracking URI
os.environ["MLFLOW_TRACKING_URI"] = "http://your-mlflow-server:5000"
with mlflow.start_run(run_name="remote_run"):
    mlflow.log_param("example", "remote")
    mlflow.log_metric("metric", 0.9)

print("Logged to remote tracking server")
Output:
Logged to remote tracking server
Troubleshooting
- If you see ConnectionError, verify your MLFLOW_TRACKING_URI and network access.
- If runs do not appear in the UI, check that the tracking server is running and accessible.
- For permission errors, ensure correct file system permissions or server authentication.
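A quick way to separate network problems from MLflow problems is to probe the server before logging. The sketch below uses only the standard library and assumes the server answers GET requests on a /health path with HTTP 200. Check that assumption against your deployment. The helper name tracking_server_reachable is hypothetical.

```python
import os
import urllib.request


def tracking_server_reachable(uri: str, timeout: float = 3.0) -> bool:
    """Return True if GET <uri>/health answers with HTTP 200 (assumed endpoint)."""
    try:
        with urllib.request.urlopen(uri.rstrip("/") + "/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers refused connections, DNS failures, timeouts, and HTTP error codes.
        return False


# Only probe HTTP(S) URIs; local file or database stores have no endpoint to check.
uri = os.environ.get("MLFLOW_TRACKING_URI", "")
if uri.startswith(("http://", "https://")):
    print(f"{uri} reachable: {tracking_server_reachable(uri)}")
```

If the probe fails while the server process is running, look at firewalls, proxies, or a wrong host or port in the URI before changing any MLflow code.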
Key Takeaways
- Use mlflow.start_run() to create experiment runs and log parameters, metrics, and models.
- Set MLFLOW_TRACKING_URI to switch between local and remote tracking servers.
- Log artifacts and models to keep all experiment outputs organized and reproducible.