How-to · Beginner · 3 min read

How to load a pretrained model from Hugging Face

Quick answer
Use the transformers library to load a pretrained model from Hugging Face by calling AutoModel.from_pretrained() and AutoTokenizer.from_pretrained() with the model name. This loads the model weights and tokenizer for inference or fine-tuning.

PREREQUISITES

  • Python 3.8+
  • pip install transformers
  • pip install torch (or tensorflow)
  • Internet connection to download model weights

Setup

Install the transformers library and a backend deep learning framework like torch or tensorflow. Set up your Python environment with these commands:

bash
pip install transformers torch
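
To confirm the installation worked, you can print the installed versions (a quick sanity check, not a required step):

```python
import transformers
import torch

# Both imports succeeding means the libraries are installed correctly
print(transformers.__version__)
print(torch.__version__)
```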

Step by step

Load a pretrained model and tokenizer from Hugging Face by specifying the model name. This example uses bert-base-uncased for demonstration.

python
from transformers import AutoModel, AutoTokenizer

# Load tokenizer and model
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Tokenize input text
inputs = tokenizer("Hello, Hugging Face!", return_tensors="pt")

# Forward pass
outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
output
torch.Size([1, 7, 768])

Common variations

  • Use AutoModelForSequenceClassification to load models fine-tuned for classification tasks.
  • Switch to the TensorFlow backend by installing tensorflow and using the TF-prefixed classes (e.g. TFAutoModel); pass from_pt=True only if the checkpoint has no TensorFlow weights.
  • Load models from local directories by passing the path instead of a model name.
python
from transformers import AutoModelForSequenceClassification

# Load a classification model
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
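
The local-directory variation from the list above can be sketched like this. The directory name is just an example; the model is downloaded once, saved to disk, and then reloaded from the path instead of the Hub name:

```python
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"
save_dir = "./bert-local"  # example path; any writable directory works

# Download once from the Hub, then save tokenizer and weights to disk
AutoTokenizer.from_pretrained(model_name).save_pretrained(save_dir)
AutoModel.from_pretrained(model_name).save_pretrained(save_dir)

# Later (even offline): load from the local path instead of the model name
tokenizer = AutoTokenizer.from_pretrained(save_dir)
model = AutoModel.from_pretrained(save_dir)
```

This is handy for air-gapped machines or for pinning an exact copy of the weights alongside your project.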

Troubleshooting

  • If the model identifier can't be resolved (e.g. an OSError saying the name is not a valid model identifier), check the spelling or confirm the model exists on the Hugging Face Hub.
  • Downloads are cached automatically under ~/.cache/huggingface, so a model is only fetched once; pass cache_dir to from_pretrained() to store weights somewhere else (e.g. a larger disk).
  • If you get CUDA errors, ensure your PyTorch installation matches your GPU setup.
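
For the caching tip above, here is a minimal sketch. The cache_dir value is just an example path; by default transformers uses ~/.cache/huggingface:

```python
from transformers import AutoModel

# Store downloaded weights in a custom location instead of the default cache.
# Subsequent calls with the same cache_dir reuse the files on disk.
model = AutoModel.from_pretrained(
    "bert-base-uncased",
    cache_dir="./hf-cache",  # example path
)
```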

Key Takeaways

  • Use the transformers library's Auto classes to load pretrained models and tokenizers easily.
  • Specify the exact model name from Hugging Face Hub to download weights automatically.
  • You can load models for different tasks by choosing the appropriate AutoModel variant.
  • Local caching and backend framework choice help optimize loading and inference.
  • Check model availability and environment compatibility to avoid common errors.
Verified 2026-04 · bert-base-uncased, distilbert-base-uncased-finetuned-sst-2-english