OSError / FileNotFoundError
OSError: Can't load model. Model {model_name} not found on huggingface.co
Stack trace
OSError: Can't load model. Model BAAI/bge-reranker-v2-m3 not found on huggingface.co.
Please check the model name, model ID, and the HuggingFace Hub website.
Traceback (most recent call last):
  File "/path/to/site-packages/transformers/utils/hub.py", line 456, in _raise_if_offline_mode_is_enabled
  File "/path/to/site-packages/huggingface_hub/utils/_headers.py", line 94, in get_session
  File "/path/to/site-packages/huggingface_hub/_snapshot_download.py", line 563, in snapshot_download
    raise OSError(f"Model {model_id} not found on huggingface.co")
Why it happens
BGE reranker weights are downloaded from the HuggingFace Hub the first time a model is loaded, unless a local copy is already cached. The error occurs when: (1) the model ID has a typo or the wrong namespace (e.g., bge-reranker-v2-m3 without the BAAI/ prefix), (2) the model is in a private or gated HuggingFace repo and your HF_TOKEN env var is missing or invalid, (3) you're offline or behind a firewall blocking huggingface.co and no cached copy exists, or (4) the model has been deleted, renamed, or moved. FlagEmbedding and sentence-transformers load models through transformers' from_pretrained machinery, which resolves the model ID against the Hub when no local copy is found.
Detection
Wrap model loading in try/except OSError and check the HuggingFace Hub directly before deploying. Log the exact model_name being requested and verify that it exists at huggingface.co/{model_name}. Use the HF_HOME env var (default: ~/.cache/huggingface) to confirm where downloads are cached.
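The detection steps above can be sketched as a small wrapper. This is a minimal sketch, not part of FlagEmbedding: the function `load_with_diagnostics` and the logger name are hypothetical, and `loader` stands for whatever constructor you use (for example `FlagReranker`).

```python
import logging
import os
from typing import Any, Callable

logger = logging.getLogger("reranker.loading")  # hypothetical logger name


def load_with_diagnostics(model_name: str, loader: Callable[[str], Any]) -> Any:
    """Call loader(model_name); on OSError, log what to check and re-raise."""
    hf_home = os.environ.get("HF_HOME", os.path.expanduser("~/.cache/huggingface"))
    logger.info("Loading reranker %r (HF_HOME=%s)", model_name, hf_home)
    try:
        return loader(model_name)
    except OSError:
        # Record the exact ID and cache location so the failure can be traced later.
        logger.error(
            "Could not load %r. Check https://huggingface.co/%s and the cache under %s",
            model_name, model_name, hf_home,
        )
        raise
```

With FlagEmbedding installed, a call would look like `load_with_diagnostics("BAAI/bge-reranker-v2-m3", FlagReranker)`. The exception is re-raised so that callers can still apply their own fallback logic.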
Causes & fixes
Model name typo or incorrect namespace (e.g., bge-reranker-v2-m3 without the BAAI/ namespace, or a misspelled or invented variant suffix)
Verify the exact repo ID at https://huggingface.co/BAAI and copy it from the model page rather than typing it; the ID must include the namespace (BAAI/bge-reranker-v2-m3, not bge-reranker-v2-m3). Add a fallback to a base model: try your model and, on OSError, load 'BAAI/bge-reranker-base'.
Private HuggingFace repo requires authentication token but HF_TOKEN is not set or invalid
Generate a HuggingFace API token at huggingface.co/settings/tokens (read access minimum). Set HF_TOKEN env var: export HF_TOKEN='hf_xxxxxxxxxxxx'. Verify with: python -c 'from huggingface_hub import whoami; print(whoami())'. Then retry model load.
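The token check can also be run from Python at startup. The sketch below assumes the `huggingface_hub` package is installed; `check_hf_token` is a hypothetical helper name, and the broad `except` is a deliberate simplification because the exact exception type for a rejected token is not relied on here.

```python
import os
from typing import Callable, Mapping, Optional


def check_hf_token(environ: Mapping[str, str] = os.environ,
                   whoami: Optional[Callable[..., dict]] = None) -> str:
    """Return a short, human-readable status for the HF_TOKEN env var."""
    token = environ.get("HF_TOKEN")
    if not token:
        return "HF_TOKEN is not set"
    if whoami is None:
        # Imported lazily so the "not set" branch needs no extra package.
        from huggingface_hub import whoami
    try:
        info = whoami(token=token)  # network call to the Hub
    except Exception as exc:  # broad on purpose: invalid token or network failure
        return f"HF_TOKEN was rejected or could not be verified: {exc}"
    return f"authenticated as {info.get('name')}"
```

Logging the returned string before the model load makes it easier to tell an authentication problem apart from a wrong model ID.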
No internet connection or firewall blocks huggingface.co during model download
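Pre-download the model on a machine with internet access: python -c 'from FlagEmbedding import FlagReranker; FlagReranker("BAAI/bge-reranker-v2-m3")'. Copy the entire ~/.cache/huggingface directory to the offline machine. Before loading the model there, set HF_HOME to the copied directory (the Hub cache is read from $HF_HOME/hub) and set HF_HUB_OFFLINE=1 so that no network lookup is attempted.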
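The pre-download step can be done without instantiating the reranker by calling `huggingface_hub.snapshot_download`. The following is a sketch under stated assumptions: `predownload` and `offline_env` are hypothetical helper names, and the cache layout assumed is the default one (`$HF_HOME/hub`).

```python
import os
from typing import Callable, Dict, Optional


def predownload(repo_id: str, hf_home: str,
                download: Optional[Callable[..., str]] = None) -> str:
    """Download a full repo snapshot into the Hub cache under hf_home."""
    if download is None:
        from huggingface_hub import snapshot_download  # imported lazily
        download = snapshot_download
    # The Hub cache lives in the "hub" subdirectory of HF_HOME by default.
    return download(repo_id=repo_id, cache_dir=os.path.join(hf_home, "hub"))


def offline_env(hf_home: str) -> Dict[str, str]:
    """Environment variables to set on the offline machine before loading."""
    return {"HF_HOME": hf_home, "HF_HUB_OFFLINE": "1"}
```

On the connected machine, `predownload("BAAI/bge-reranker-v2-m3", "/path/to/cache")` fills the cache. On the offline machine, apply `os.environ.update(offline_env("/path/to/cache"))` before importing FlagEmbedding or huggingface_hub, since these settings are read when the library is imported.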
Model has been deleted, archived, or moved on HuggingFace Hub
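Check if the model exists: curl -I https://huggingface.co/api/models/BAAI/bge-reranker-v2-m3. A 404 means no repo is available under that ID; a 401 or 403 usually means the repo is private or gated (unauthenticated requests for a missing repo may also be answered this way). A deleted repo cannot be recovered through the revision parameter, so use a copy you cached earlier if you have one. If the repo still exists but its files changed, pin a known-good commit, e.g. huggingface_hub.snapshot_download('BAAI/bge-reranker-v2-m3', revision='<commit-sha>'), and load the reranker from the returned local path. Otherwise switch to a maintained model: 'BAAI/bge-reranker-large' or 'cross-encoder/ms-marco-MiniLM-L-12-v2'.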
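The same existence check can be scripted with the Python standard library. This is a rough sketch: `hub_status` and `explain` are hypothetical names, and the interpretation of each status code is a heuristic rather than documented Hub behavior.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def hub_status(repo_id: str, token: Optional[str] = None,
               opener: Callable = urllib.request.urlopen,
               timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status of the repo's API endpoint, or None if unreachable."""
    request = urllib.request.Request(
        f"https://huggingface.co/api/models/{repo_id}", method="HEAD"
    )
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:  # must come before OSError, its ancestor
        return err.code
    except OSError:  # URLError, timeouts, DNS failures
        return None


def explain(status: Optional[int]) -> str:
    """Map a status code to a likely cause (heuristic)."""
    if status is None:
        return "unreachable: check network, proxy, or firewall"
    if status == 200:
        return "ok: repo is visible with the current credentials"
    if status in (401, 403):
        return "auth: repo may be private or gated, or may not exist; check HF_TOKEN"
    if status == 404:
        return "missing: no repo under this ID"
    return f"unexpected status {status}"
```

A call such as `explain(hub_status("BAAI/bge-reranker-v2-m3"))` gives a one-line diagnosis that can be logged next to the failing model ID.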
Code: broken vs fixed
import os
from FlagEmbedding import FlagReranker

# BROKEN: misspelled repo ID and no error handling
reranker = FlagReranker('BAAI/bge-reranker-v2m3')  # ← illustrative typo (missing hyphen), so the Hub lookup fails
pairs = [["query", "passage1"], ["query", "passage2"]]
scores = reranker.compute_score(pairs)
print(scores)

import os
from FlagEmbedding import FlagReranker
from huggingface_hub import login

# Log in only if a token is provided (needed for private or gated repos)
if hf_token := os.environ.get('HF_TOKEN'):
    login(token=hf_token)

# FIXED: use verified repo IDs, handle OSError, and fall back in order
model_names = [
    'BAAI/bge-reranker-v2-m3',  # Primary model (confirm the exact ID on huggingface.co/BAAI)
    'BAAI/bge-reranker-large',  # Fallback model
    'BAAI/bge-reranker-base',   # Smaller last-resort fallback
]

# The Hub cache lives under $HF_HOME/hub by default
cache_dir = os.path.join(
    os.environ.get('HF_HOME', os.path.expanduser('~/.cache/huggingface')), 'hub'
)

reranker = None
for model_name in model_names:
    try:
        print(f"Loading reranker: {model_name}")
        reranker = FlagReranker(
            model_name_or_path=model_name,
            use_fp16=True,
            cache_dir=cache_dir,
        )
        print(f"✓ Successfully loaded: {model_name}")
        break
    except OSError as e:
        # Covers a wrong ID, missing credentials, and network failures
        print(f"✗ Could not load: {model_name}\nError: {e}")

if reranker is None:
    raise RuntimeError("Could not load any reranker model. Check HF_TOKEN, network access, and model names.")

pairs = [["query", "passage1"], ["query", "passage2"]]
scores = reranker.compute_score(pairs)
print(f"Reranker scores: {scores}")
Workaround
If you can't fix model loading immediately, load the reranker from local files instead. Either pass a local directory directly, FlagReranker('/local/path/to/model'), or clone the repo with Git LFS (git lfs install, then git clone https://huggingface.co/BAAI/bge-reranker-v2-m3) and pass the cloned directory as model_name_or_path. Alternatively, check whether an ONNX export of a bge-reranker model is published (for example under huggingface.co/Xenova) and run it with ONNX Runtime instead of PyTorch.
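Before pointing the reranker at a local directory, it can help to check that the directory is usable. A common pitfall is cloning without Git LFS, which leaves small pointer files in place of the weights. The sketch below uses only the standard library; the helper names are hypothetical, and `config.json` is assumed to be the minimum file a transformers model directory contains.

```python
import os
from typing import List, Sequence

# Git LFS pointer files begin with this line instead of the real file content.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def missing_model_files(model_dir: str,
                        required: Sequence[str] = ("config.json",)) -> List[str]:
    """Return the required files that are absent from model_dir."""
    return [name for name in required
            if not os.path.isfile(os.path.join(model_dir, name))]


def lfs_pointer_files(model_dir: str) -> List[str]:
    """Return top-level files that are still Git LFS pointers, not real data."""
    pointers = []
    for name in sorted(os.listdir(model_dir)):
        path = os.path.join(model_dir, name)
        if os.path.isfile(path):
            with open(path, "rb") as handle:
                if handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                    pointers.append(name)
    return pointers
```

If both functions return empty lists for the cloned directory, passing that path to `FlagReranker` is a reasonable next step. If `lfs_pointer_files` reports entries, running `git lfs pull` inside the clone should fetch the real weights.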
Prevention
At deployment: (1) pre-download all reranker models in your Docker build layer with HF_HOME=/app/models, (2) keep HF_HOME=/app/models and set HF_HUB_OFFLINE=1 at runtime to force local-only loading, (3) version-pin reranker models in a config file with fallback chains, (4) if the service runs with Hub access, use huggingface_hub.model_info() at startup to validate model accessibility before serving requests (it makes a network call, so run it at build time or in CI when HF_HUB_OFFLINE=1 is set at runtime).
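Points (3) and (4) can be combined into a startup check that walks a pinned fallback chain. This is a minimal sketch, assuming `huggingface_hub` is installed and the Hub is reachable when the check runs; `PinnedModel` and `first_accessible` are hypothetical names, and the default `probe` only confirms that the repo and revision resolve, without downloading weights.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class PinnedModel:
    repo_id: str
    revision: str = "main"  # replace with a commit hash to pin exactly


def first_accessible(chain: Sequence[PinnedModel],
                     probe: Optional[Callable[[PinnedModel], object]] = None) -> PinnedModel:
    """Return the first model in the chain whose probe succeeds."""
    if probe is None:
        from huggingface_hub import model_info  # imported lazily

        def probe(model: PinnedModel) -> object:
            # Network call: raises if the repo or revision cannot be resolved.
            return model_info(model.repo_id, revision=model.revision)

    failures = []
    for model in chain:
        try:
            probe(model)
            return model
        except Exception as exc:  # broad on purpose: auth, network, and not-found errors
            failures.append(f"{model.repo_id}@{model.revision}: {exc}")
    raise RuntimeError("No reranker model is accessible:\n" + "\n".join(failures))
```

A chain such as `[PinnedModel("BAAI/bge-reranker-v2-m3"), PinnedModel("BAAI/bge-reranker-base")]` could be loaded from the config file and checked once before the service starts accepting requests.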