How to classify text with Hugging Face

Quick answer

Use the Hugging Face transformers library with a pretrained text classification model such as distilbert-base-uncased-finetuned-sst-2-english. The pipeline API loads the model and its tokenizer for you; pass it your text to get back a label and a confidence score.

Prerequisites

Python 3.8+
pip install "transformers>=4.30.0" (quote the requirement so the shell does not treat >= as a redirect)
pip install torch (or tensorflow)
Internet connection for model download

Setup

Install the Hugging Face transformers library and a backend like torch or tensorflow. The transformers package provides pretrained models and pipelines for text classification.

bash

pip install transformers torch

output

Collecting transformers
Collecting torch
Successfully installed torch-2.0.1 transformers-4.32.1
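
To confirm what actually got installed, a small check like the one below can help. It is a sketch that relies only on the standard library's importlib.metadata (available from Python 3.8); the helper name installed_versions is made up for this example.

```python
from importlib import metadata


def installed_versions(packages=("transformers", "torch", "tensorflow")):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment
            versions[name] = None
    return versions


print(installed_versions())
```

A value of None for both torch and tensorflow would mean no backend is installed, in which case the pipeline cannot load a model.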

Step by step

Use the pipeline API for text classification. This example uses the distilbert-base-uncased-finetuned-sst-2-english model for sentiment classification.

python

from transformers import pipeline

# Initialize text classification pipeline
classifier = pipeline("text-classification", model="distilbert-base-uncased-finetuned-sst-2-english")

# Input text
text = "I love using Hugging Face Transformers!"

# Run classification
result = classifier(text)
print(result)

output

[{'label': 'POSITIVE', 'score': 0.9998}]
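
The pipeline returns a list with one dictionary per input text, each holding a label and a score. As a rough sketch of how that result might be used downstream, the hypothetical helper below accepts a prediction only when its score clears a threshold. The 0.9 cutoff is an arbitrary assumption, not a recommended value.

```python
def confident_label(prediction, threshold=0.9):
    """Return the predicted label, or None if the score is below threshold."""
    if prediction["score"] >= threshold:
        return prediction["label"]
    return None


# Result copied from the example output above
result = [{"label": "POSITIVE", "score": 0.9998}]

print(confident_label(result[0]))                     # POSITIVE
print(confident_label(result[0], threshold=0.99999))  # None
```

Returning None for low-confidence predictions leaves it to the caller to decide what to do with them, such as routing them to manual review.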

Common variations

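Use a model fine-tuned for a different task, for example an MRPC fine-tune such as bert-base-cased-finetuned-mrpc for paraphrase detection. Verify the exact model ID on the Hugging Face Hub, and note that paraphrase models expect a pair of sentences as input.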
Run classification on a batch of texts by passing a list to the pipeline.
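Call the pipeline from asyncio code by running it in a worker thread. Pipelines are synchronous and cannot be awaited directly, so hand the call to loop.run_in_executor to keep the event loop free.

python

import asyncio
from functools import partial
from transformers import pipeline

# Load the pipeline once and reuse it; loading is the expensive step
classifier = pipeline("text-classification", model="distilbert-base-uncased-finetuned-sst-2-english")

async def async_classify(texts):
    # The pipeline call blocks, so run it in the default thread pool
    loop = asyncio.get_running_loop()
    # top_k=None returns every label's score, sorted highest first
    # (it replaces the deprecated return_all_scores=True)
    return await loop.run_in_executor(None, partial(classifier, texts, top_k=None))

texts = ["I love this!", "This is bad."]

async def main():
    results = await async_classify(texts)
    print(results)

asyncio.run(main())

output (scores rounded; exact values will vary)

[[{'label': 'POSITIVE', 'score': 0.999}, {'label': 'NEGATIVE', 'score': 0.001}], [{'label': 'NEGATIVE', 'score': 0.998}, {'label': 'POSITIVE', 'score': 0.002}]]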
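
When the pipeline returns a score for every label, each input text maps to a list of label/score dictionaries. The sketch below shows one way to pair each text with its highest-scoring label. The helper name best_labels is hypothetical, and the sample data mirrors the rounded scores in the output above rather than a fresh model run. Because the helper takes the maximum by score, it does not depend on the order of the labels.

```python
def best_labels(texts, all_scores):
    """Pair each text with its highest-scoring label and that score."""
    paired = []
    for text, scores in zip(texts, all_scores):
        top = max(scores, key=lambda entry: entry["score"])
        paired.append((text, top["label"], top["score"]))
    return paired


texts = ["I love this!", "This is bad."]

# Per-label scores in the shape returned by the pipeline (values rounded)
all_scores = [
    [{"label": "NEGATIVE", "score": 0.001}, {"label": "POSITIVE", "score": 0.999}],
    [{"label": "NEGATIVE", "score": 0.998}, {"label": "POSITIVE", "score": 0.002}],
]

for text, label, score in best_labels(texts, all_scores):
    print(f"{text!r}: {label} ({score:.3f})")
```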

Troubleshooting

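If loading fails with an OSError (for example, the model ID cannot be found or huggingface.co cannot be reached), check the spelling of the model ID and your internet access, or download the model ahead of time and load it from a local directory.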
For CUDA errors, verify your PyTorch installation matches your GPU setup.
If classification results seem incorrect, try a different pretrained model suited for your task.

Key Takeaways

Use Hugging Face pipeline for quick text classification with pretrained models.
Pass single or batch texts to the pipeline for flexible classification.
Pipelines are synchronous; run them in a worker thread to keep an asyncio event loop responsive.
Choose a model pretrained on your specific classification task for best accuracy.

Verified 2026-04 · distilbert-base-uncased-finetuned-sst-2-english, bert-base-uncased-finetuned-mrpc