GatedRepoError
huggingface_hub.utils._errors.GatedRepoError
Stack trace
huggingface_hub.utils._errors.GatedRepoError: Access to model meta-llama/Llama-2-7b-hf is restricted. You must have access to this repo. Visit https://huggingface.co/meta-llama/Llama-2-7b-hf to accept the license and get access.
Why it happens
Meta Llama models (Llama 2, 3.1, 3.2, 3.3) are hosted on HuggingFace in gated repositories. To download one you must: (1) have a HuggingFace account, (2) visit the model page and explicitly accept the model's license terms, and (3) authenticate your local environment with a valid HuggingFace access token. If any of the three is missing, the Hub rejects the download with an HTTP 401 or 403 response, which huggingface_hub raises as GatedRepoError. This is a legal/licensing requirement, not a bug.
Detection
Before downloading a Llama model, check that a HuggingFace token is available in your environment (huggingface_hub.get_token() returns None when no token is found) and verify that you've accepted the license on the HuggingFace model page. Note that get_token() only tells you a token is stored, not that it is still valid or that the license has been accepted. Add logging before model initialization to catch missing tokens early.
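A minimal sketch of such an early check, assuming you want it to work even when huggingface_hub is not installed. The function names find_hf_token and log_token_status and the logger name are illustrative, not part of any library; only get_token() comes from huggingface_hub.

```python
import logging
import os

logger = logging.getLogger("hf_preflight")  # hypothetical logger name


def find_hf_token(env=None):
    """Return a token from HF_TOKEN or the cached login, or None if neither exists."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN")
    if token:
        return token
    try:
        # get_token() also covers tokens saved earlier by login().
        from huggingface_hub import get_token
    except ImportError:
        return None
    return get_token()


def log_token_status(env=None):
    """Log whether a token was found; return True if one is present."""
    token = find_hf_token(env)
    if token is None:
        logger.warning("No HuggingFace token found; gated downloads will fail.")
        return False
    # Never log the token itself, only the fact that one exists.
    logger.info("HuggingFace token found.")
    return True
```

Calling log_token_status() once at startup, before any from_pretrained call, surfaces a missing token in the logs. A True result does not prove the token is valid or that the license was accepted.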
Causes & fixes
No HuggingFace token provided or token not found in environment
Set the HF_TOKEN environment variable to your HuggingFace access token, or log in explicitly: from huggingface_hub import login; login(token='hf_...'). Create a token at https://huggingface.co/settings/tokens
Token is valid but you haven't accepted the model's license on HuggingFace.co
Visit https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct (or the page of the specific model you need) and click 'Agree and access repository' to accept the license terms
Using an old or revoked HuggingFace token
Generate a new token at https://huggingface.co/settings/tokens (copy the full token starting with 'hf_'), then update the HF_TOKEN environment variable and log in again
Model name is misspelled or refers to a non-existent/private repository
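Verify the exact model ID on HuggingFace.co (e.g., 'meta-llama/Llama-3.2-3B-Instruct', not 'llama-3-2-3b'). Copy the model name directly from the HuggingFace URL. A wrong ID usually raises RepositoryNotFoundError rather than GatedRepoError, so rule it out first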
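The causes above can be told apart from the exception itself. The sketch below is one hedged way to do that: explain_hub_error is a hypothetical helper, and it matches exception classes by name only so that the snippet runs without importing huggingface_hub. It assumes the exception carries the HTTP response in a response attribute, that a 401 indicates a missing or invalid token, and that a 403 indicates a valid token without accepted access. In real code you would catch GatedRepoError and RepositoryNotFoundError from huggingface_hub.utils directly.

```python
def explain_hub_error(exc):
    """Map a Hub exception to the most likely cause from the list above."""
    # Collect the class names in the exception's inheritance chain.
    class_names = {cls.__name__ for cls in type(exc).__mro__}
    # The HTTP status, if the exception carries a response object.
    status = getattr(getattr(exc, "response", None), "status_code", None)

    # GatedRepoError subclasses RepositoryNotFoundError, so test it first.
    if "GatedRepoError" in class_names:
        if status == 401:
            return "missing or invalid token: set HF_TOKEN or call login()"
        if status == 403:
            return "license not accepted: open the model page and request access"
        return "gated repo: check both the token and the license acceptance"
    if "RepositoryNotFoundError" in class_names:
        return "repo not found: check the model ID spelling, or the repo is private"
    return None  # not a Hub access problem this helper knows about
```

A typical use is to wrap the from_pretrained call in try/except, pass the caught exception to explain_hub_error, and log the returned hint before re-raising.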
Code: broken vs fixed
Broken:

from transformers import AutoTokenizer, AutoModelForCausalLM

# This fails with GatedRepoError because no token is provided and license not accepted
model_id = 'meta-llama/Llama-3.2-3B-Instruct'
tokenizer = AutoTokenizer.from_pretrained(model_id)  # ← Line that fails
model = AutoModelForCausalLM.from_pretrained(model_id)

Fixed:

import os
from transformers import AutoTokenizer, AutoModelForCausalLM
from huggingface_hub import login

# Step 1: Authenticate with HuggingFace token (get from https://huggingface.co/settings/tokens)
hf_token = os.environ.get('HF_TOKEN')
if hf_token:
    login(token=hf_token)  # ← Added: Log in with token
else:
    raise ValueError('HF_TOKEN environment variable not set. Get a token at https://huggingface.co/settings/tokens')

# Step 2: Load model (after accepting license on https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)
model_id = 'meta-llama/Llama-3.2-3B-Instruct'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
print('Model loaded successfully!')

Workaround
If you can't use HuggingFace tokens (e.g., in a restricted environment), you can run Llama locally with Ollama: run ollama pull llama3.2:3b, then talk to the local server at localhost:11434, for example with the ollama Python client. This avoids HuggingFace gating entirely and runs the model on your own hardware.
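As a rough illustration of the local route, the sketch below calls Ollama's HTTP endpoint /api/generate with only the standard library instead of the ollama Python client mentioned above. It assumes an Ollama server is already running on the default port and that llama3.2:3b has been pulled. The helper names build_generate_request and generate are made up for this example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="llama3.2:3b"):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3.2:3b", timeout=120):
    """Send the prompt and return the generated text (requires a running server)."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        # With stream=False the server replies with a single JSON object.
        return json.loads(resp.read())["response"]
```

With the server running, generate("Say hello in one sentence.") returns the model's reply as a string. No HuggingFace token is involved at any point.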
Prevention
Adopt a three-step onboarding pattern for gated models: (1) Document which models require gating in your README, (2) Store HF_TOKEN in your .env file or secrets manager (never commit it), (3) Add a pre-flight check function that validates the token and model access before the main application loads. Use structured logging to surface token/license issues before they hit users.
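Step (3) could look roughly like the sketch below. preflight and hub_can_access are hypothetical names. The access check is passed in as a function so the logic can be exercised without network access. The bundled checker assumes each model repo contains a config.json, which may not hold for every repository, and a locally cached copy of that file may mask a lost permission.

```python
def preflight(model_ids, token, can_access):
    """Return a list of problems; an empty list means it is safe to start loading."""
    if not token:
        return ["HF_TOKEN is not set"]
    # Ask the supplied checker about each model and report the ones it rejects.
    return [f"no access to {m}" for m in model_ids if not can_access(m, token)]


def hub_can_access(model_id, token):
    """Hypothetical checker: try to fetch config.json from the Hub."""
    # Imported lazily so preflight() itself has no third-party dependency.
    from huggingface_hub import hf_hub_download
    from huggingface_hub.utils import GatedRepoError, RepositoryNotFoundError

    try:
        hf_hub_download(repo_id=model_id, filename="config.json", token=token)
    except (GatedRepoError, RepositoryNotFoundError):
        return False
    return True
```

At startup you might call preflight(["meta-llama/Llama-3.2-3B-Instruct"], os.environ.get("HF_TOKEN"), hub_can_access) and refuse to continue if the returned list is non-empty, logging each entry.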