How-to · Intermediate · 3 min read

How to detect model drift in production

Quick answer
Detect model drift in production by continuously monitoring input data distributions and model performance metrics, using statistical tests and distance measures such as the Kolmogorov-Smirnov test, KL divergence, or the Population Stability Index (PSI). Implement automated alerts that trigger retraining or investigation when significant deviations occur.
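To make the distribution-comparison idea concrete, KL divergence between a baseline and a production sample can be estimated by histogramming both on a shared set of bin edges. This is a minimal sketch; the helper name, bin count, and smoothing constant are illustrative choices, not a standard API.

```python
import numpy as np
from scipy.stats import entropy

def kl_divergence(baseline, production, bins=20):
    """Approximate KL(baseline || production) by histogramming both
    samples on shared bin edges."""
    edges = np.histogram_bin_edges(
        np.concatenate([baseline, production]), bins=bins
    )
    p, _ = np.histogram(baseline, bins=edges, density=True)
    q, _ = np.histogram(production, bins=edges, density=True)
    # Small constant keeps empty bins from producing infinite divergence
    p, q = p + 1e-9, q + 1e-9
    return entropy(p, q)  # scipy normalizes p and q internally

rng = np.random.default_rng(0)
same = kl_divergence(rng.normal(0, 1, 5000), rng.normal(0, 1, 5000))
shifted = kl_divergence(rng.normal(0, 1, 5000), rng.normal(0.5, 1.2, 5000))
print(f"no drift: {same:.3f}, drift: {shifted:.3f}")
```

A drifted sample scores visibly higher than a second draw from the baseline distribution, which is what makes the metric usable as an alert signal.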

Prerequisites

  • Python 3.8+
  • pip install numpy scipy scikit-learn matplotlib
  • Access to production model predictions and input data logs

Setup

Install the necessary Python libraries for data analysis and monitoring:

  • numpy for numerical operations
  • scipy for statistical tests
  • scikit-learn for metrics
  • matplotlib for visualization
```bash
pip install numpy scipy scikit-learn matplotlib
```

Step by step

This example demonstrates detecting model drift by comparing feature distributions and monitoring model accuracy over time.

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt

# Simulated baseline data (training set features)
baseline_features = np.random.normal(loc=0, scale=1, size=1000)

# Simulated production data: both mean and variance have shifted
production_features = np.random.normal(loc=0.5, scale=1.2, size=1000)

# Step 1: Statistical test for feature distribution drift (Kolmogorov-Smirnov test)
ks_stat, p_value = ks_2samp(baseline_features, production_features)
print(f"KS statistic: {ks_stat:.3f}, p-value: {p_value:.3f}")

# Step 2: Simulated model predictions and true labels
# Baseline: predictions match labels by construction, so accuracy is 1.0
baseline_preds = (baseline_features > 0).astype(int)
baseline_labels = (baseline_features > 0).astype(int)
baseline_acc = accuracy_score(baseline_labels, baseline_preds)

# Production: the boundary the model learned (0) no longer matches the
# true boundary (0.3), simulating the accuracy impact of drift
production_preds = (production_features > 0).astype(int)
production_labels = (production_features > 0.3).astype(int)
production_acc = accuracy_score(production_labels, production_preds)

print(f"Baseline accuracy: {baseline_acc:.3f}")
print(f"Production accuracy: {production_acc:.3f}")

# Step 3: Visualize feature distributions
plt.hist(baseline_features, bins=30, alpha=0.5, label='Baseline')
plt.hist(production_features, bins=30, alpha=0.5, label='Production')
plt.legend()
plt.title('Feature Distribution Comparison')
plt.show()
```
```output
KS statistic: 0.193, p-value: 0.000
Baseline accuracy: 1.000
Production accuracy: 0.905
```

Exact values vary between runs because no random seed is set. Baseline accuracy is 1.000 by construction, and the simulated 0.3 label shift pulls production accuracy down to roughly 0.90.
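The quick answer calls for automated alerts; one minimal way to turn the KS test into an alert decision is to require both statistical significance and a practically large effect, since at production sample sizes tiny shifts produce tiny p-values. The function name and thresholds below are illustrative and should be tuned for your traffic.

```python
import numpy as np
from scipy.stats import ks_2samp

def check_drift(baseline, production, p_threshold=0.05, stat_threshold=0.1):
    """Flag drift only when the KS test is significant AND the effect
    size is large enough to matter operationally."""
    stat, p_value = ks_2samp(baseline, production)
    drifted = (p_value < p_threshold) and (stat > stat_threshold)
    return {"ks_stat": stat, "p_value": p_value, "drifted": drifted}

rng = np.random.default_rng(1)
report = check_drift(rng.normal(0, 1, 1000), rng.normal(0.5, 1.2, 1000))
if report["drifted"]:
    print("ALERT: input drift detected", report)
```

In a real pipeline this check would run on a schedule against rolling windows of logged features, with the alert routed to your monitoring system rather than printed.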

Common variations

To enhance drift detection:

  • Use Population Stability Index (PSI) for feature distribution shifts.
  • Monitor multiple features and aggregate drift scores.
  • Track model confidence scores and prediction distributions.
  • Implement asynchronous monitoring pipelines with streaming data.
  • Use specialized drift detection libraries like alibi-detect or evidently.
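A minimal PSI sketch makes the first variation concrete. The 10-bin quantile scheme and the common 0.1 / 0.25 rule-of-thumb thresholds are conventions rather than fixed standards, and the function below is an illustrative implementation, not a library API.

```python
import numpy as np

def psi(baseline, production, bins=10):
    """Population Stability Index using quantile bins fit on the baseline."""
    edges = np.quantile(baseline, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch out-of-range production values
    expected = np.histogram(baseline, bins=edges)[0] / len(baseline)
    actual = np.histogram(production, bins=edges)[0] / len(production)
    # Clip to avoid log(0) when a bin receives no production samples
    expected = np.clip(expected, 1e-6, None)
    actual = np.clip(actual, 1e-6, None)
    return float(np.sum((actual - expected) * np.log(actual / expected)))

rng = np.random.default_rng(2)
stable = psi(rng.normal(0, 1, 5000), rng.normal(0, 1, 5000))
drifted = psi(rng.normal(0, 1, 5000), rng.normal(0.5, 1.2, 5000))
print(f"stable: {stable:.3f}, drifted: {drifted:.3f}")
# Common rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 major shift
```

To monitor multiple features, compute PSI per feature and aggregate (for example, alert on the maximum or on the count of features above threshold).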

Troubleshooting

If you see false positives in drift detection:

  • Check if data sampling is representative and consistent.
  • Adjust statistical test thresholds to balance sensitivity and false alarms.
  • Ensure labels used for performance monitoring are accurate and timely.
  • Validate data preprocessing consistency between baseline and production.
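One concrete way to tame false positives when monitoring many features at once: per-feature p-value thresholds multiply the overall false-alarm rate, and a Bonferroni correction is a simple (if conservative) remedy. The feature count and alpha below are illustrative.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(3)
n_features, alpha = 20, 0.05
# Simulate 20 features with NO real drift between baseline and production
baseline = rng.normal(size=(1000, n_features))
production = rng.normal(size=(1000, n_features))

raw_hits, corrected_hits = 0, 0
for j in range(n_features):
    _, p = ks_2samp(baseline[:, j], production[:, j])
    raw_hits += p < alpha                      # naive per-feature threshold
    corrected_hits += p < alpha / n_features   # Bonferroni-corrected threshold
print(f"naive alerts: {raw_hits}, corrected alerts: {corrected_hits}")
```

With no true drift present, the naive threshold still fires on roughly 5% of features on average, while the corrected threshold rarely fires at all.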

Key Takeaways

  • Continuously monitor input data distributions and model performance metrics to detect drift early.
  • Use statistical tests like Kolmogorov-Smirnov or PSI to quantify distribution changes.
  • Automate alerts and retraining triggers based on drift detection to maintain model reliability.
Verified 2026-04