Code Intermediate medium · 5 min

pipeline("text-classification")

What you will learn

Use the text-classification pipeline to instantly classify text into predefined categories without writing a single training loop.

Why this matters

Text classification is one of the most common NLP tasks in production (spam detection, sentiment analysis, intent detection). The pipeline abstracts away tokenization, model loading, and inference so you can ship working classification in under a minute.

Skip if: you need to fine-tune on custom data, need fine-grained control over batching for performance, or must integrate with existing inference infrastructure. In those cases, load the model and tokenizer directly and write your own inference code.

Explanation

What it is: pipeline("text-classification") is a high-level API that wraps a pre-trained transformer model, its tokenizer, and the inference logic into a single callable. You pass in text and get back class predictions with confidence scores.

How it works mechanically: The pipeline loads a model (distilbert-base-uncased-finetuned-sst-2-english when you don't name one), instantiates the matching tokenizer, converts your input text to token IDs, runs the model on them, and applies softmax to the output logits to produce class probabilities. All of this happens behind a single function call.

When to use it: Use pipeline for quick prototyping, demos, and production inference on standard tasks where an off-the-shelf model works well. For custom or performance-critical applications, drop down to manual model/tokenizer loading.
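The mechanical steps described above can be sketched by hand. This is a minimal sketch under stated assumptions, not the pipeline's actual implementation: softmax, postprocess, and classify_manually are hypothetical helper names, the logits at the bottom are made up for illustration, and classify_manually needs the transformers and torch packages plus network access to download the model.

```python
import math


def softmax(logits):
    """Turn raw model scores (logits) into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def postprocess(logits, id2label):
    """Pick the most probable class and return it in the pipeline's output shape."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": id2label[best], "score": probs[best]}


def classify_manually(text, model_id="distilbert-base-uncased-finetuned-sst-2-english"):
    """Hypothetical helper: tokenize, run the model, then post-process one text."""
    # Imported lazily so the pure-Python helpers above work without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)  # text -> token IDs
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**inputs).logits[0].tolist()
    return postprocess(logits, model.config.id2label)


# Illustrative logits (made up for this example), not real model output.
example = postprocess([-2.0, 3.0], {0: "NEGATIVE", 1: "POSITIVE"})
print(example["label"], round(example["score"], 4))
```

Calling classify_manually("I loved it") should return the same kind of {"label", "score"} dictionary the pipeline produces, which makes it a reasonable starting point when you later need control over each step.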

Analogy

It's like ordering food at a restaurant versus cooking from scratch. The pipeline is the restaurant: you describe what you want (text), it handles everything behind the counter (tokenization, model inference), and hands you the result. Manual model loading is cooking from scratch: more control, but more work.

Code

python

import torch
from transformers import pipeline

# Pin the model explicitly so the same checkpoint runs everywhere.
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    # Use the first GPU when one is available, otherwise fall back to CPU (-1).
    device=0 if torch.cuda.is_available() else -1,
)

texts = [
    "I absolutely loved that movie! Best film of the year.",
    "This product is terrible. Complete waste of money.",
    "The weather is nice today.",
]

# Passing a list returns a list with one {"label", "score"} dict per input.
results = classifier(texts)

for text, result in zip(texts, results):
    print(f"Text: {text}")
    print(f"Label: {result['label']}, Score: {result['score']:.4f}")
    print()

Output

Text: I absolutely loved that movie! Best film of the year.
Label: POSITIVE, Score: 0.9998

Text: This product is terrible. Complete waste of money.
Label: NEGATIVE, Score: 0.9997

Text: The weather is nice today.
Label: POSITIVE, Score: 0.9954

What just happened?

The code instantiated a text-classification pipeline pointing to a specific pre-trained model, then passed three pieces of text through it. Each text was tokenized internally, fed through the DistilBERT model, and converted to a softmax probability distribution over [NEGATIVE, POSITIVE]. The result is a list with one dictionary per input text, each holding the top predicted label and its confidence score. Because this model only knows those two labels, the neutral weather sentence is still forced into one of them.
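Since this model can only answer NEGATIVE or POSITIVE, one common way to handle neutral or off-topic inputs is to treat low-confidence predictions as undecided. The sketch below assumes a hypothetical helper named flag_uncertain and an arbitrary threshold of 0.999; the sample results are copied from the output shown above rather than produced by a fresh run.

```python
def flag_uncertain(results, threshold=0.9):
    """Replace the label with 'UNCERTAIN' when the score falls below threshold."""
    flagged = []
    for result in results:
        label = result["label"] if result["score"] >= threshold else "UNCERTAIN"
        flagged.append({"label": label, "score": result["score"]})
    return flagged


# Sample results in the pipeline's output format, copied from the run above.
sample = [
    {"label": "POSITIVE", "score": 0.9998},
    {"label": "NEGATIVE", "score": 0.9997},
    {"label": "POSITIVE", "score": 0.9954},
]

for item in flag_uncertain(sample, threshold=0.999):
    print(item)
```

The weather sentence scored 0.9954, so softmax scores can be high even for inputs that carry no real sentiment. Treat any threshold as something to calibrate on your own data, not as a fixed rule.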

Common gotcha

Developers often call pipeline("text-classification") without pinning a model name and then ship a different model to production than the one they tested. Always specify the model= parameter explicitly, because the default model can change across versions. Also, pipeline() does not batch by default: if you pass a list of 1000 texts, it runs them through the model one at a time. Pass batch_size= to the pipeline or to the call to enable batching, and drop down to a custom batch loop with the underlying model when you need maximum throughput.
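For large inputs, one option is to feed the pipeline fixed-size chunks and let it batch within each chunk through its batch_size argument. The sketch below is an illustration only: classify_in_chunks is a hypothetical helper, the chunk and batch sizes are arbitrary, and fake_classifier is a stand-in so the example runs without downloading a model.

```python
def classify_in_chunks(classifier, texts, chunk_size=256, batch_size=32):
    """Feed texts to a pipeline-like callable in fixed-size chunks.

    classifier is assumed to accept (list_of_texts, batch_size=...) and to
    return one result dict per text, as a transformers pipeline does.
    """
    results = []
    for start in range(0, len(texts), chunk_size):
        chunk = texts[start:start + chunk_size]
        results.extend(classifier(chunk, batch_size=batch_size))
    return results


def fake_classifier(texts, batch_size=1):
    """Stand-in for a real pipeline so this sketch runs without a model."""
    return [{"label": "POSITIVE", "score": 1.0} for _ in texts]


texts = [f"review number {i}" for i in range(1000)]
results = classify_in_chunks(fake_classifier, texts)
print(len(results))
```

With a real pipeline you would pass the classifier object in place of fake_classifier. Larger batches are not always faster, especially on CPU or with inputs of very uneven length, so measure before settling on a value.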

Error recovery

ImportError: No module named 'transformers'

Install with: pip install transformers torch

torch.cuda.OutOfMemoryError

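Pass device=-1 to use CPU, or load the weights in half precision with automatic device placement: pipeline(..., device_map='auto', model_kwargs={'torch_dtype': torch.float16}). device_map='auto' requires the accelerate package. Note that float16 halves memory use but is reduced precision, not quantization.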

OSError: Can't load model. Model not found

The model name doesn't exist on Hugging Face Hub. Verify the exact model ID by searching huggingface.co/models

ValueError: The parameter device should be an integer

Pass device as an integer: 0, 1, etc. for the GPU ID, or -1 for CPU. Whether strings such as 'cuda' or 'cpu' are accepted depends on the transformers version, so if you see this error, switch to the integer form.
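If device names reach your code as strings (from a config file, for example), a small converter can turn them into the integer form before calling pipeline(). This is a sketch: to_device_index is a hypothetical helper and is not part of transformers.

```python
def to_device_index(device):
    """Convert 'cpu', 'cuda', 'cuda:N', or an int into the integer form.

    Returns -1 for CPU and the GPU index otherwise.
    """
    if isinstance(device, int):
        return device
    name = str(device).strip().lower()
    if name == "cpu":
        return -1
    if name == "cuda":
        return 0  # a bare 'cuda' is treated as the first GPU
    if name.startswith("cuda:") and name[5:].isdigit():
        return int(name[5:])
    raise ValueError(f"Unrecognized device: {device!r}")


for raw in ["cpu", "cuda", "cuda:1", 0]:
    print(raw, "->", to_device_index(raw))
```

The result can then be passed as pipeline(..., device=to_device_index(name)).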

Experienced dev note

The pipeline() API is a convenience wrapper. It's fast for prototyping but hides important details by default: you don't see the tokenizer's max_length, the padding and truncation strategy is chosen for you, and inputs run one at a time unless you set batch_size. In production, you'll often drop down to manual model/tokenizer loading after the pipeline proves the concept works. Also: in transformers 5.5.x, the default device logic changed. Always explicitly set device or use device_map='auto' to avoid platform-specific surprises.

Check your understanding

Why would passing a list of 100 texts to the same pipeline twice produce identical outputs, and why might passing the same 100 texts split across multiple batches to a custom inference loop produce different outputs?

Answer hint

The correct answer recognizes that the pipeline is deterministic for the same input, but a custom batched loop might encounter different padding or truncation behavior depending on batch composition, and floating-point rounding differences in softmax can shift which class gets the highest score on edge cases.
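The edge case in the hint can be illustrated with a near-tie. The logits below are hypothetical, and the 3e-7 perturbation merely stands in for the kind of numerical noise that different padding or batch composition might introduce; it is not measured from a real model.

```python
import math


def softmax(logits):
    """Turn logits into probabilities, subtracting the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, labels=("NEGATIVE", "POSITIVE")):
    """Return the winning label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical near-tie: the two classes are almost equally likely.
base = [1.0000001, 1.0]
# The same logits with a tiny perturbation on the second class.
noisy = [1.0000001, 1.0 + 3e-7]

print(top_label(base))
print(top_label(noisy))
```

Both runs give a score of roughly 0.5, yet the winning label differs. On confident predictions such as the ones in the example output, noise this small would not change the label.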

VERSION In transformers < 4.25.0, pipeline() required you to manually handle device placement. In 4.25.0+, device_map='auto' became available. In transformers 5.5.x (April 2026), the default device logic is more robust, but explicitly setting device or device_map is strongly recommended to avoid version-dependent behavior.

Explore how to load a model and tokenizer separately for finer control over inference, enabling batch processing and custom token handling beyond what pipeline() provides.
