How to plot ROC curve in Scikit-learn
Quick answer
Use sklearn.metrics.roc_curve to compute the false positive rate (FPR) and true positive rate (TPR), then plot them with matplotlib.pyplot. Fit a classifier, get predicted probabilities for the positive class, and pass the true labels and probabilities to roc_curve.

Prerequisites
- Python 3.8+
- pip install scikit-learn matplotlib numpy
Setup
Install required libraries with pip if not already installed:
pip install scikit-learn matplotlib numpy

Step by step
This example trains a logistic regression classifier on a binary classification dataset, computes the ROC curve, and plots it using matplotlib.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split
# Generate synthetic binary classification data
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Train logistic regression classifier
model = LogisticRegression(solver='liblinear')
model.fit(X_train, y_train)
# Predict probabilities for positive class
y_scores = model.predict_proba(X_test)[:, 1]
# Compute ROC curve and ROC area
fpr, tpr, thresholds = roc_curve(y_test, y_scores)
roc_auc = auc(fpr, tpr)
# Plot ROC curve
plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, color='darkorange', lw=2, label=f'ROC curve (area = {roc_auc:.2f})')
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC) Curve')
plt.legend(loc='lower right')
plt.grid(True)
plt.show()

Common variations
- Use roc_auc_score from sklearn.metrics to get a scalar AUC value directly.
- Plot ROC curves for multi-class classification using a one-vs-rest strategy.
- Use other classifiers such as RandomForestClassifier or XGBClassifier with the same approach.
from sklearn.metrics import roc_auc_score
# Calculate AUC score directly
auc_score = roc_auc_score(y_test, y_scores)
print(f'AUC score: {auc_score:.2f}')

Output:
AUC score: 0.91
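The one-vs-rest multi-class variation can be sketched as follows. This is an illustrative example rather than part of the original walkthrough: it generates a synthetic 3-class dataset, binarizes the labels with label_binarize, and computes one ROC curve and AUC per class.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Synthetic 3-class problem (n_informative raised so 3 classes fit)
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           n_classes=3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_prob = model.predict_proba(X_test)  # shape (n_samples, 3)

# One-vs-rest: binarize the labels, then draw one ROC curve per class
y_test_bin = label_binarize(y_test, classes=[0, 1, 2])
aucs = {}
for i in range(3):
    fpr, tpr, _ = roc_curve(y_test_bin[:, i], y_prob[:, i])
    aucs[i] = auc(fpr, tpr)
    print(f'class {i}: AUC = {aucs[i]:.2f}')
```

Each per-class curve can be passed to plt.plot exactly as in the binary example above.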
Troubleshooting
- If predict_proba is not available, use decision_function for classifiers like SVM.
- Ensure binary labels are 0 and 1; otherwise, binarize labels before ROC computation.
- If the ROC curve looks like a diagonal line, the scores are no better than random: check that the model was actually trained and that you are passing probabilities or decision scores, not hard class predictions.
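A minimal sketch of the decision_function fallback mentioned above, using LinearSVC (a scikit-learn classifier without predict_proba) on freshly generated synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

svm = LinearSVC(max_iter=5000).fit(X_train, y_train)

# LinearSVC has no predict_proba; the signed distances to the separating
# hyperplane work as ranking scores for roc_curve
y_scores = svm.decision_function(X_test)
fpr, tpr, thresholds = roc_curve(y_test, y_scores)
svm_auc = auc(fpr, tpr)
print(f'AUC from decision_function scores: {svm_auc:.2f}')
```

ROC analysis only needs a score that ranks positives above negatives, so decision_function values work as well as probabilities here.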
Key Takeaways
- Use roc_curve with true labels and predicted probabilities to get FPR and TPR for ROC plotting.
- Plot the ROC curve with matplotlib for visual evaluation of binary classifiers.
- Use roc_auc_score to quantify classifier performance with a single scalar.
- For classifiers without predict_proba, use decision_function scores instead.
- Ensure labels are binary and properly formatted before ROC computation.