Precision, recall, and F1 for classification
Quick answer
Use the precision_score, recall_score, and f1_score functions to evaluate classification models. These metrics measure the accuracy of positive predictions, the ability to find all positive samples, and their harmonic mean, respectively, and can be computed easily with scikit-learn.
Prerequisites
- Python 3.8+
- pip install scikit-learn
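To make the definitions concrete before reaching for scikit-learn, here is a minimal hand-rolled sketch (the function name binary_metrics is illustrative, not part of any library) that computes all three metrics from true/false positive counts for binary labels:

```python
# Precision = TP / (TP + FP), recall = TP / (TP + FN),
# F1 = harmonic mean of precision and recall.
def binary_metrics(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# 2 true positives, 1 false positive, 1 false negative:
print(binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```

This is what the scikit-learn functions below compute for you, with additional input validation and multi-class support.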
Setup
Install scikit-learn if you haven't already. This library provides built-in functions to calculate precision, recall, and F1 score for classification tasks.
pip install scikit-learn

Output:
Requirement already satisfied: scikit-learn in /usr/local/lib/python3.10/site-packages (1.3.0)
Requirement already satisfied: numpy>=1.17.3 in /usr/local/lib/python3.10/site-packages (from scikit-learn) (1.25.0)
Requirement already satisfied: scipy>=1.5.0 in /usr/local/lib/python3.10/site-packages (from scikit-learn) (1.11.1)
Step by step
Use precision_score, recall_score, and f1_score from sklearn.metrics to compute these metrics from true and predicted labels.
from sklearn.metrics import precision_score, recall_score, f1_score
# Example true labels and predicted labels
true_labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
pred_labels = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]
# Calculate precision, recall, and F1 score
precision = precision_score(true_labels, pred_labels)
recall = recall_score(true_labels, pred_labels)
f1 = f1_score(true_labels, pred_labels)
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1 Score: {f1:.2f}")

Output:
Precision: 0.80
Recall: 0.80
F1 Score: 0.80
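If you want all three metrics per class in one call, scikit-learn also provides classification_report, which summarizes precision, recall, and F1 along with support counts (shown here with the same labels as above):

```python
from sklearn.metrics import classification_report

true_labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
pred_labels = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# Prints per-class precision/recall/F1 plus macro and weighted averages.
print(classification_report(true_labels, pred_labels))
```

Pass output_dict=True if you need the same numbers as a dictionary for programmatic use.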
Common variations
You can compute metrics for multi-class classification by specifying the average parameter (e.g., macro, micro, weighted). For example, f1_score(y_true, y_pred, average='macro').
These functions also integrate cleanly into larger model-evaluation pipelines.
from sklearn.metrics import f1_score
# Multi-class example
true_labels = [0, 1, 2, 2, 1]
pred_labels = [0, 2, 2, 2, 0]
f1_macro = f1_score(true_labels, pred_labels, average='macro')
f1_micro = f1_score(true_labels, pred_labels, average='micro')
print(f"F1 Macro: {f1_macro:.2f}")
print(f"F1 Micro: {f1_micro:.2f}")

Output:
F1 Macro: 0.49
F1 Micro: 0.60
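To see which classes drag the macro average down, pass average=None to get one score per class (same toy labels as above; zero_division=0 is added because class 1 is never predicted):

```python
from sklearn.metrics import f1_score

true_labels = [0, 1, 2, 2, 1]
pred_labels = [0, 2, 2, 2, 0]

# average=None returns an array with one F1 score per class.
per_class = f1_score(true_labels, pred_labels, average=None, zero_division=0)
print(per_class)  # class 1 is never predicted, so its F1 is 0.0
```

The macro score is simply the unweighted mean of these per-class values, which is why a single missed class can lower it sharply.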
Troubleshooting
- If you get an UndefinedMetricWarning, there are no positive predictions for a class; consider setting zero_division=0 in the metric functions.
- Ensure your true and predicted label arrays have the same length and use consistent label encoding.
precision = precision_score(true_labels, pred_labels, zero_division=0)
recall = recall_score(true_labels, pred_labels, zero_division=0)
f1 = f1_score(true_labels, pred_labels, zero_division=0)

Output (with the same print statements as above):
Precision: 0.80
Recall: 0.80
F1 Score: 0.80
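As a concrete illustration of the warning case, here is a hypothetical set of labels where the model makes no positive predictions at all; with zero_division=0 the undefined precision is reported as 0.0 instead of emitting UndefinedMetricWarning:

```python
from sklearn.metrics import precision_score

true_labels = [1, 0, 1]
pred_labels = [0, 0, 0]  # no positive predictions at all

# Precision is undefined here (TP + FP == 0); zero_division=0
# replaces the undefined value with 0.0 and suppresses the warning.
precision = precision_score(true_labels, pred_labels, zero_division=0)
print(precision)  # 0.0
```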
Key Takeaways
- Use scikit-learn metrics to easily compute precision, recall, and F1 for classification.
- Specify the average parameter for multi-class classification metrics.
- Handle zero-division warnings by setting zero_division=0 in metric functions.