High severity · Intermediate · Fix: 5-10 min

OSError / FileNotFoundError

OSError: Can't load model. Model {model_name} not found on huggingface.co

What this error means

The BGE reranker fails to load a cross-encoder model from the HuggingFace Hub because the model name is incorrect, the model is not cached locally and cannot be downloaded, or the repo is private and requires authentication.

Stack trace

traceback

OSError: Can't load model. Model BAAI/bge-reranker-v2-m3 not found on huggingface.co.

Please check the model name, model ID, and the HuggingFace Hub website.

Traceback (most recent call last):
  File "/path/to/site-packages/transformers/utils/hub.py", line 456, in _raise_if_offline_mode_is_enabled
  File "/path/to/site-packages/huggingface_hub/utils/_headers.py", line 94, in get_session
  File "/path/to/site-packages/huggingface_hub/_snapshot_download.py", line 563, in snapshot_download
    raise OSError(f"Model {model_id} not found on huggingface.co")

QUICK FIX

Add try/except OSError around model load, log the model_name, verify it exists on huggingface.co, and set HF_TOKEN env var if the repo is private.

Why it happens

BGE reranker models are downloaded from the HuggingFace Hub the first time they are loaded, unless a copy is already in the local cache. The error occurs when: (1) the model name has a typo or the wrong namespace (e.g., BAAI/bge-reranker-v2m3 instead of BAAI/bge-reranker-v2-m3), (2) the model is in a private HuggingFace repo and your HF_TOKEN env var is missing or invalid, (3) you're offline or behind a firewall that blocks huggingface.co and the model is not cached, or (4) the model has been deleted, renamed, or moved. FlagEmbedding and sentence-transformers load rerankers through the transformers Auto classes (typically AutoModelForSequenceClassification for cross-encoders), which resolve the model ID against the Hub.

Detection

Wrap model loading in try/except OSError and check HuggingFace Hub directly before deploying. Log the exact model_name being requested and verify it exists at huggingface.co/{model_name}. Use HF_HOME env var to track download cache location.
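A minimal preflight sketch of the checks described above, assuming huggingface_hub is installed. The helper names describe_hub_env and model_is_visible are hypothetical and not part of FlagEmbedding. The lookup argument defaults to huggingface_hub.model_info and is injectable only so the check can be exercised without network access.

```python
import logging
import os

logger = logging.getLogger("reranker.preflight")


def describe_hub_env():
    """Collect the Hub settings that most often explain a failed load."""
    return {
        "HF_HOME": os.environ.get("HF_HOME", os.path.expanduser("~/.cache/huggingface")),
        "HF_TOKEN_set": bool(os.environ.get("HF_TOKEN")),  # never log the token itself
        "HF_HUB_OFFLINE": os.environ.get("HF_HUB_OFFLINE", "0"),
    }


def model_is_visible(model_id, lookup=None):
    """Return True if the Hub lookup for model_id succeeds, else log and return False."""
    if lookup is None:
        # Imported lazily so this module can be imported without huggingface_hub
        from huggingface_hub import model_info
        lookup = model_info
    try:
        lookup(model_id)
        return True
    except Exception as exc:  # missing repo, bad token, or no network
        logger.error(
            "Hub lookup failed for %r (%s): %s | env=%s",
            model_id, type(exc).__name__, exc, describe_hub_env(),
        )
        return False
```

Calling model_is_visible('BAAI/bge-reranker-v2-m3') before deploying logs the exact ID and the relevant environment settings in one place, which is usually enough to tell a typo from a missing token or an offline host.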

Causes & fixes

Model name typo or incorrect namespace (e.g., BAAI/bge-reranker-v2m3 when the correct name is BAAI/bge-reranker-v2-m3)

✓ Fix

Verify the exact model name at https://huggingface.co/BAAI and copy the repo ID from the model page. Load it with FlagReranker using a known-good ID such as 'BAAI/bge-reranker-v2-m3'. Add a fallback to a base model: try your model and, on OSError, load 'BAAI/bge-reranker-base'.
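As a rough illustration of catching a typo before the load is attempted, the sketch below compares the requested ID against a short list of repo IDs using the standard library's difflib. The function name suggest_model_id and the KNOWN_RERANKERS list are assumptions for this example; the list should hold IDs you have copied from the Hub yourself.

```python
import difflib

# Repo IDs copied from the Hub; extend this with the models you actually use.
KNOWN_RERANKERS = [
    "BAAI/bge-reranker-base",
    "BAAI/bge-reranker-large",
    "BAAI/bge-reranker-v2-m3",
]


def suggest_model_id(requested, known=KNOWN_RERANKERS, cutoff=0.8):
    """Return known repo IDs that resemble `requested`, best match first."""
    if requested in known:
        return [requested]
    # get_close_matches sorts candidates by similarity, most similar first
    return difflib.get_close_matches(requested, known, n=3, cutoff=cutoff)
```

For a misspelling such as 'BAAI/bge-reranker-v2m3', the first suggestion is the correctly hyphenated ID, which can be logged alongside the OSError to shorten debugging.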

Private HuggingFace repo requires authentication token but HF_TOKEN is not set or invalid

✓ Fix

Generate a HuggingFace API token at huggingface.co/settings/tokens (read access minimum). Set HF_TOKEN env var: export HF_TOKEN='hf_xxxxxxxxxxxx'. Verify with: python -c 'from huggingface_hub import whoami; print(whoami())'. Then retry model load.

No internet connection or firewall blocks huggingface.co during model download

✓ Fix

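Pre-download the model on a machine with internet access: python -c 'from FlagEmbedding import FlagReranker; FlagReranker("BAAI/bge-reranker-v2-m3")'. Copy the entire ~/.cache/huggingface directory (which contains hub/) to the offline machine. Before loading the model there, set HF_HOME to the copied directory, e.g. HF_HOME=/path/to/cache so that the models sit under /path/to/cache/hub, and set HF_HUB_OFFLINE=1 so that no network lookup is attempted.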
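To confirm that the copied cache actually contains the model, a sketch like the following can look it up on disk. It assumes the default Hub cache layout, <HF_HOME>/hub/models--<org>--<name>/snapshots/<commit>/, and does not account for overrides such as HF_HUB_CACHE. The function name find_cached_snapshot is hypothetical.

```python
import os
from pathlib import Path


def find_cached_snapshot(model_id, hf_home=None):
    """Return a cached snapshot directory for model_id, or None if nothing is cached."""
    hf_home = Path(hf_home or os.environ.get("HF_HOME", Path.home() / ".cache" / "huggingface"))
    repo_dir = hf_home / "hub" / ("models--" + model_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    if not snapshots.is_dir():
        return None
    candidates = [p for p in snapshots.iterdir() if p.is_dir()]
    if not candidates:
        return None
    # Prefer the commit recorded for the main branch, if the cache has one
    ref = repo_dir / "refs" / "main"
    if ref.is_file():
        pinned = snapshots / ref.read_text().strip()
        if pinned.is_dir():
            return pinned
    # Otherwise fall back to the most recently modified snapshot
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

If the function returns None on the offline machine, the copy is incomplete or HF_HOME points at the wrong directory; if it returns a path, that path can also be passed directly to FlagReranker as a local model directory.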

Model has been deleted, archived, or moved on HuggingFace Hub

✓ Fix

Check whether the model exists: curl -I https://huggingface.co/api/models/BAAI/bge-reranker-v2-m3. A 404 means the repo is gone; without a token the Hub may also answer 401, which can mean the repo is either missing or private. If the repo was renamed or moved, update the model ID to the new location. A deleted repo cannot be recovered through a revision parameter, so load from a local copy if you have one, or switch to a maintained model such as 'BAAI/bge-reranker-large' or 'cross-encoder/ms-marco-MiniLM-L-12-v2'.

Code: broken vs fixed

Broken - triggers the error

python

from FlagEmbedding import FlagReranker

# BROKEN: typo in the model name (missing hyphen) and no error handling
reranker = FlagReranker('BAAI/bge-reranker-v2m3')  # ← No such repo; the real ID is BAAI/bge-reranker-v2-m3
pairs = [["query", "passage1"], ["query", "passage2"]]
scores = reranker.compute_score(pairs)
print(scores)

Fixed - works correctly

python

import os
from FlagEmbedding import FlagReranker
from huggingface_hub import login

# Set HuggingFace token if repo is private (optional)
if hf_token := os.environ.get('HF_TOKEN'):
    login(token=hf_token)

# FIXED: Use verified model names, add error handling, use fallback
model_names = [
    'BAAI/bge-reranker-v2-m3',  # Primary model (copy the exact ID from the HF Hub page)
    'BAAI/bge-reranker-large',  # Fallback model
    'BAAI/bge-reranker-base'  # Smaller last-resort fallback
]

reranker = None
for model_name in model_names:
    try:
        print(f"Loading reranker: {model_name}")
        reranker = FlagReranker(
            model_name_or_path=model_name,
            use_fp16=True,
            # cache_dir is the hub cache itself, i.e. the hub/ folder under HF_HOME
            cache_dir=os.path.join(
                os.environ.get('HF_HOME', os.path.expanduser('~/.cache/huggingface')), 'hub'
            )
        )
        print(f"✓ Successfully loaded: {model_name}")
        break
    except OSError as e:
        print(f"✗ Model not found: {model_name}\nError: {e}")
        continue

if reranker is None:
    raise RuntimeError("Could not load any reranker model. Check HF_TOKEN and model names.")

pairs = [["query", "passage1"], ["query", "passage2"]]
scores = reranker.compute_score(pairs)
print(f"Reranker scores: {scores}")

Added HF_TOKEN login for private repos, explicit error handling with fallback model chain, environment variable for cache location, and print statements to identify which model fails and which succeeds.

Workaround

If you can't fix model loading immediately, load the reranker from a local directory instead of the Hub. Fetch the model files on a machine with access: run git lfs install, then git clone https://huggingface.co/BAAI/bge-reranker-v2-m3. Pass the clone directory as the model path: FlagReranker('/local/path/to/model'). Alternatively, use an ONNX export of a bge-reranker model if one is available to you (for example from huggingface.co/Xenova), which can run on ONNX Runtime without loading the PyTorch model.
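Before pointing the reranker at a local directory, it can help to check that the directory looks like a complete checkpoint. The sketch below is an assumption-laden sanity check, not an official validator: the helper name missing_model_files is hypothetical, it only looks for config.json and a single-file weight checkpoint, and it does not cover sharded weights or tokenizer files.

```python
from pathlib import Path

# Single-file weight formats commonly found in Hub model repos
WEIGHT_FILES = ("model.safetensors", "pytorch_model.bin")


def missing_model_files(model_dir):
    """List what a local checkpoint directory lacks; an empty list means it looks loadable."""
    model_dir = Path(model_dir)
    missing = []
    if not (model_dir / "config.json").is_file():
        missing.append("config.json")
    if not any((model_dir / name).is_file() for name in WEIGHT_FILES):
        missing.append(" or ".join(WEIGHT_FILES))
    # A git clone made without git-lfs leaves tiny pointer files instead of weights
    for name in WEIGHT_FILES:
        weight = model_dir / name
        if weight.is_file() and weight.stat().st_size < 1024:
            missing.append(f"{name} (looks like a git-lfs pointer)")
    return missing
```

An empty result suggests the directory can be passed to FlagReranker as a local path; a git-lfs pointer warning usually means git lfs install was skipped before cloning.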

Prevention

At deployment: (1) pre-download all reranker models in your Docker build layer to /app/models, (2) set HF_HOME=/app/models and HF_HUB_OFFLINE=1 at runtime to force local-only loading, (3) version-pin reranker models in a config file with fallback chains, (4) validate model accessibility with huggingface_hub.model_info() in CI or at build time, while the network is still available. With HF_HUB_OFFLINE=1 set at runtime that call cannot reach the Hub, so at startup verify that the local files load before serving requests.
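Steps (1) and (3) could be combined in a small build-time script along these lines. This is a sketch under stated assumptions: PINNED_MODELS and predownload are hypothetical names, the "main" revisions are placeholders to be replaced with commit hashes copied from the Hub, and the download argument defaults to huggingface_hub.snapshot_download.

```python
# Pinned models: repo ID -> revision (branch, tag, or commit hash).
# "main" is a placeholder; pin a commit hash for reproducible builds.
PINNED_MODELS = {
    "BAAI/bge-reranker-v2-m3": "main",
    "BAAI/bge-reranker-large": "main",
}


def predownload(models=PINNED_MODELS, download=None):
    """Fetch every pinned model into the Hub cache and return its local path."""
    if download is None:
        # Imported lazily so the pin list can be read without huggingface_hub installed
        from huggingface_hub import snapshot_download
        download = snapshot_download
    paths = {}
    for repo_id, revision in models.items():
        # snapshot_download returns the local folder holding that revision
        paths[repo_id] = download(repo_id=repo_id, revision=revision)
    return paths
```

Running predownload() during the image build with HF_HOME=/app/models places the files where the runtime settings from step (2) expect them, so the build fails early if a repo ID or revision is wrong.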

Python 3.9+ · FlagEmbedding >=1.2.0 · tested on 1.3.x

Verified 2026-04 · BAAI/bge-reranker-v2-m3, BAAI/bge-reranker-large, cross-encoder/ms-marco-MiniLM-L-12-v2
