Zero-shot classification with transformers pipeline
Quick answer
Use the Hugging Face transformers library's pipeline with the zero-shot-classification task to classify text into labels without prior training. Initialize the pipeline with a model like facebook/bart-large-mnli and call it with your text and candidate labels.
Prerequisites
- Python 3.8+
- pip install "transformers>=4.30.0" (quoted so the shell does not treat >= as a redirect)
- pip install torch (or tensorflow)
- Internet connection to download model weights
Setup
Install the transformers library and a backend like torch or tensorflow. The pipeline API simplifies zero-shot classification with pretrained models.
pip install transformers torch

Output:
Collecting transformers
Collecting torch
Successfully installed transformers-4.30.0 torch-2.0.1
Step by step
Initialize the zero-shot classification pipeline with a suitable model, then classify your input text against candidate labels.
from transformers import pipeline
# Initialize zero-shot classification pipeline
classifier = pipeline('zero-shot-classification', model='facebook/bart-large-mnli')
# Input text to classify
sequence_to_classify = "I love programming in Python and exploring AI models."
# Candidate labels
candidate_labels = ['technology', 'sports', 'politics', 'education']
# Perform classification
result = classifier(sequence_to_classify, candidate_labels)
print('Labels:', result['labels'])
print('Scores:', result['scores'])

Output:
Labels: ['technology', 'education', 'sports', 'politics']
Scores: [0.987, 0.012, 0.0005, 0.0003]
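The pipeline returns a dict in which labels is sorted by descending score, so the top prediction is always the first element. A minimal sketch of extracting it, working from the example output above (the dict literal mirrors the printed result rather than a live pipeline call):

```python
# Example result structure from the zero-shot pipeline: 'labels' is sorted
# by descending score, so index 0 is the best match.
result = {
    'sequence': 'I love programming in Python and exploring AI models.',
    'labels': ['technology', 'education', 'sports', 'politics'],
    'scores': [0.987, 0.012, 0.0005, 0.0003],
}

top_label = result['labels'][0]
top_score = result['scores'][0]
print(f"Best label: {top_label} ({top_score:.1%})")  # → Best label: technology (98.7%)
```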
Common variations
- Use multi_label=True to allow multiple labels per input.
- Switch models, e.g. roberta-large-mnli instead of facebook/bart-large-mnli, depending on accuracy and speed needs.
- Pass a list of sequences to the classifier to batch inputs for higher throughput; the pipeline returns one result dict per sequence.
from transformers import pipeline
# Multi-label classification example
classifier = pipeline('zero-shot-classification', model='facebook/bart-large-mnli')
sequence = "The government passed a new education reform bill."
labels = ['politics', 'education', 'technology']
result = classifier(sequence, labels, multi_label=True)
print(result)

Output:
{'sequence': 'The government passed a new education reform bill.', 'labels': ['education', 'politics', 'technology'], 'scores': [0.85, 0.75, 0.05]}

Troubleshooting
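With multi_label=True, each score is an independent probability, so several labels can be high at once; a common pattern is to accept every label above a threshold. A sketch using the example output above (the dict literal mirrors the printed result, and the 0.5 cutoff is an arbitrary illustrative choice):

```python
# Example multi-label result: scores are independent probabilities,
# so more than one label can exceed a chosen threshold.
result = {
    'sequence': 'The government passed a new education reform bill.',
    'labels': ['education', 'politics', 'technology'],
    'scores': [0.85, 0.75, 0.05],
}

THRESHOLD = 0.5  # illustrative cutoff; tune per application
accepted = [label for label, score in zip(result['labels'], result['scores'])
            if score >= THRESHOLD]
print(accepted)  # → ['education', 'politics']
```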
- If you get OSError: Model name 'facebook/bart-large-mnli' was not found, ensure you have internet access to download the model, or pre-download it manually.
- For slow inference, consider a smaller model or run on GPU (pass device=0 to pipeline).
- If the pipeline import fails, verify that transformers is installed and up to date.
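To pre-download the model for a machine with restricted network access, one option (assuming a recent huggingface_hub install, which provides the huggingface-cli tool) is to fetch it into the local cache once; later pipeline() calls then load it from disk:

```shell
# Pre-download the model into the local Hugging Face cache; subsequent
# pipeline('zero-shot-classification', ...) calls load it without network access.
huggingface-cli download facebook/bart-large-mnli
```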
Key takeaways
- Use Hugging Face transformers pipeline with 'zero-shot-classification' for quick text classification without training.
- The 'facebook/bart-large-mnli' model is a reliable default for zero-shot tasks.
- Enable multi-label classification with the 'multi_label=True' parameter when needed.
- Install 'transformers' and a backend like 'torch' to run the pipeline.
- Troubleshoot model download and environment issues by checking internet and package versions.