How to detect model drift in production
Quick answer
Detect model drift in production by continuously monitoring input feature distributions and model performance metrics, using statistical measures such as the Kolmogorov-Smirnov test, KL divergence, or the population stability index (PSI) to quantify distribution shifts. Trigger automated alerts when deviations exceed a threshold so the team can investigate or retrain.
Prerequisites
- Python 3.8+
- pip install numpy scipy scikit-learn matplotlib
- Access to production model predictions and input data logs
Setup
Install necessary Python libraries for data analysis and monitoring:
- numpy for numerical operations
- scipy for statistical tests
- scikit-learn for metrics
- matplotlib for visualization
pip install numpy scipy scikit-learn matplotlib

Step by step
This example demonstrates detecting model drift by comparing feature distributions and monitoring model accuracy over time.
import numpy as np
from scipy.stats import ks_2samp
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
# Simulated baseline data (training set features)
baseline_features = np.random.normal(loc=0, scale=1, size=1000)
# Simulated production data (current features)
production_features = np.random.normal(loc=0.5, scale=1.2, size=1000)
# Step 1: Statistical test for feature distribution drift (Kolmogorov-Smirnov test)
ks_stat, p_value = ks_2samp(baseline_features, production_features)
print(f"KS statistic: {ks_stat:.3f}, p-value: {p_value:.3f}")
# Step 2: Simulated model predictions and true labels
# Baseline accuracy
baseline_preds = (baseline_features > 0).astype(int)
baseline_labels = (baseline_features > 0).astype(int)  # labels equal preds by construction, so baseline accuracy is 1.0
baseline_acc = accuracy_score(baseline_labels, baseline_preds)
# Production accuracy (simulate drift impact)
production_preds = (production_features > 0).astype(int)
production_labels = (production_features > 0.3).astype(int) # Slight label shift
production_acc = accuracy_score(production_labels, production_preds)
print(f"Baseline accuracy: {baseline_acc:.3f}")
print(f"Production accuracy: {production_acc:.3f}")
# Step 3: Visualize feature distributions
plt.hist(baseline_features, bins=30, alpha=0.5, label='Baseline')
plt.hist(production_features, bins=30, alpha=0.5, label='Production')
plt.legend()
plt.title('Feature Distribution Comparison')
plt.show()

Output
KS statistic: 0.193, p-value: 0.000
Baseline accuracy: 1.000
Production accuracy: 0.835
Common variations
To enhance drift detection:
- Use Population Stability Index (PSI) for feature distribution shifts.
- Monitor multiple features and aggregate drift scores.
- Track model confidence scores and prediction distributions.
- Implement asynchronous monitoring pipelines with streaming data.
- Use specialized drift detection libraries like alibi-detect or evidently.
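SciPy has no built-in PSI, so here is a minimal sketch of one common formulation using baseline-quantile bins; the 0.1/0.2 interpretation thresholds are a widespread rule of thumb, not a standard:

```python
import numpy as np

def psi(baseline, production, n_bins=10):
    """Population Stability Index between two 1-D samples.
    Bin edges come from baseline quantiles; the outer edges are
    widened so every production value falls in a bin, and a small
    epsilon guards against log(0) in empty bins."""
    eps = 1e-6
    edges = np.quantile(baseline, np.linspace(0, 1, n_bins + 1))
    edges[0] = min(edges[0], production.min()) - eps
    edges[-1] = max(edges[-1], production.max()) + eps
    base_pct = np.histogram(baseline, bins=edges)[0] / len(baseline) + eps
    prod_pct = np.histogram(production, bins=edges)[0] / len(production) + eps
    return float(np.sum((prod_pct - base_pct) * np.log(prod_pct / base_pct)))

rng = np.random.default_rng(42)
baseline = rng.normal(0, 1, 1000)
production = rng.normal(0.5, 1.2, 1000)
# rule of thumb: < 0.1 stable, 0.1-0.2 moderate shift, > 0.2 major shift
print(f"PSI: {psi(baseline, production):.3f}")
```

Unlike the KS p-value, PSI has no sampling distribution attached, so the thresholds are conventions rather than significance levels.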
Troubleshooting
If you see false positives in drift detection:
- Check if data sampling is representative and consistent.
- Adjust statistical test thresholds to balance sensitivity and false alarms.
- Ensure labels used for performance monitoring are accurate and timely.
- Validate data preprocessing consistency between baseline and production.
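One concrete way to reduce false alarms when monitoring many features is to correct the test threshold for multiple comparisons. A sketch using a Bonferroni correction (the correction choice and alpha value are assumptions; other corrections such as Benjamini-Hochberg are also common):

```python
import numpy as np
from scipy.stats import ks_2samp

def monitor_features(baseline, production, alpha=0.05):
    """Run a KS test per feature with a Bonferroni-corrected
    threshold, so testing many features does not inflate the
    overall false-alarm rate.
    baseline, production: arrays of shape (n_samples, n_features)."""
    n_features = baseline.shape[1]
    corrected_alpha = alpha / n_features
    drifted = []
    for j in range(n_features):
        stat, p = ks_2samp(baseline[:, j], production[:, j])
        if p < corrected_alpha:
            drifted.append((j, stat, p))
    return drifted

rng = np.random.default_rng(1)
base = rng.normal(0, 1, size=(1000, 5))
prod = rng.normal(0, 1, size=(1000, 5))
prod[:, 2] += 0.5  # inject drift into feature 2 only
print(monitor_features(base, prod))
```

The injected shift in feature 2 should be flagged while the unchanged features usually pass at the corrected threshold.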
Key takeaways
- Continuously monitor input data distributions and model performance metrics to detect drift early.
- Use statistical tests like Kolmogorov-Smirnov or PSI to quantify distribution changes.
- Automate alerts and retraining triggers based on drift detection to maintain model reliability.