How to use AutoModelForSequenceClassification
Quick answer
Use AutoModelForSequenceClassification from Hugging Face Transformers to load a pretrained model for text classification by specifying the model name or path. Combine it with AutoTokenizer to preprocess input text, then pass the tokenized inputs to the model to get classification logits.
Prerequisites
- Python 3.8+
- pip install transformers torch
- Basic knowledge of PyTorch
Setup
Install the transformers and torch libraries into your Python environment to use Hugging Face models.
pip install transformers torch
Step by step
This example shows how to load a pretrained sequence classification model and tokenizer, tokenize input text, and get classification logits.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
# Load pretrained model and tokenizer
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
# Prepare input text
texts = ["I love using Hugging Face!", "This is a bad example."]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
# Forward pass (no gradients needed for inference)
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
# Convert logits to probabilities
probs = torch.nn.functional.softmax(logits, dim=-1)
print("Logits:", logits)
print("Probabilities:", probs)
Output
The output will look like this (illustrative values; actual numbers vary by model version):
Logits: tensor([[ 3.2, -3.1],
        [-2.9,  2.7]])
Probabilities: tensor([[9.98e-01, 1.83e-03],
        [3.68e-03, 9.96e-01]])
Common variations
- Use different pretrained models by changing model_name to any sequence classification checkpoint on the Hugging Face Hub.
- Speed up inference by moving the model and the tokenized inputs to a GPU with .to("cuda").
- Use the Trainer API to fine-tune the model on your own dataset.
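To turn logits into human-readable labels, take the argmax over the class dimension and look the class id up in the model's id2label mapping (available on a real model as model.config.id2label). A minimal sketch in plain Python; the id2label dict here is an assumption mirroring a two-class sentiment config, and the logits are made-up example values:

```python
# Hypothetical id2label mapping; on a real model, read model.config.id2label instead
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

# Illustrative logits, one row per input text
logits = [[-3.1, 3.2], [2.7, -2.9]]

def predict_labels(logits, id2label):
    """Argmax over each row of logits, then map the winning class id to its label."""
    return [id2label[max(range(len(row)), key=lambda i: row[i])] for row in logits]

print(predict_labels(logits, id2label))  # ['POSITIVE', 'NEGATIVE']
```

With real model outputs, the same lookup works on outputs.logits after converting each row's argmax to a Python int.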
Troubleshooting
- If model loading fails with an OSError saying the model cannot be found, verify the model name is spelled correctly and is available on the Hugging Face Hub.
- For CUDA errors, ensure PyTorch is installed with GPU support and your GPU drivers are up to date.
- If tokenization fails, check that the input texts are strings and that the tokenizer loaded correctly.
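The tokenization failure mode above can be caught early by checking inputs before calling the tokenizer. A small sketch; validate_texts is a hypothetical helper name, not part of the Transformers API:

```python
def validate_texts(texts):
    """Raise early if any input is not a string, before tokenizer(texts, ...) is called."""
    bad = [i for i, t in enumerate(texts) if not isinstance(t, str)]
    if bad:
        raise TypeError(f"Inputs at positions {bad} are not strings")
    return texts

validate_texts(["fine", "also fine"])   # returns the list unchanged
# validate_texts(["fine", None])        # raises TypeError
```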
Key takeaways
- Use AutoModelForSequenceClassification.from_pretrained() to load any pretrained classification model.
- Always pair the model with the matching AutoTokenizer for correct input preprocessing.
- Pass tokenized inputs as keyword arguments to the model to get logits for classification.
- Convert logits to probabilities with softmax for interpretable outputs.
- Check model availability and environment setup to avoid common errors.
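The logits-to-probabilities step can be checked by hand. A plain-Python sketch of softmax, applied to the first row of example logits from the walkthrough above (this reimplements what torch.nn.functional.softmax does, purely for illustration):

```python
import math

def softmax(row):
    """Numerically stable softmax: shift by the row max before exponentiating."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([3.2, -3.1])
print([round(p, 5) for p in probs])  # [0.99817, 0.00183]
```

Subtracting the max before exponentiating avoids overflow on large logits without changing the result.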