DeprecationWarning / ModelNotFoundError
transformers.utils.generic.ModelNotFoundError / DeprecationWarning (Qwen1/1.5 EOL)
Stack trace
transformers.utils.generic.ModelNotFoundError: qwen1-7b is not a valid model identifier. Check the model name passed to from_pretrained() or verify it exists on Hugging Face Model Hub.
Alternatively, when loading from Ollama:
Error: model 'qwen:1.5' not found locally and not available in Ollama library. Use 'ollama pull qwen2.5:7b' instead.
DeprecationWarning: qwen-1.5 model identifier is deprecated and will be removed in a future version. Migrate to qwen2.5-7b or newer.
Why it happens
Qwen1 and Qwen1.5 models reached their end-of-support date in Q4 2024. Alibaba Cloud removed these model versions from the Hugging Face Model Hub and Ollama registries to consolidate on the Qwen2.5 series, which offers significantly better instruction-following, reasoning, and multilingual capabilities. When you attempt to load qwen1-7b, qwen1-14b, qwen1.5-7b, or qwen1.5-32b from transformers or Ollama, the lookup fails with a 404 or the client library emits a DeprecationWarning, and your pipeline stops.
Detection
Check your code for hardcoded model IDs like 'qwen1-7b' or 'qwen:1.5'. Add a pre-flight validation that queries the Hugging Face Model Hub API or Ollama's model list to confirm your model identifier exists before attempting to load it. Log any deprecation warnings at startup.
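The pre-flight validation described above could look roughly like the sketch below. It uses only the standard library and rests on a few assumptions: the Hub exposes model metadata at https://huggingface.co/api/models/<repo_id>, Ollama is listening on its default local port and lists installed models at /api/tags, and any non-200 response from the Hub (missing, private, or gated without credentials) is treated as a failure. The function names are illustrative and are not part of transformers or Ollama.

```python
import json
import urllib.error
import urllib.request

HUB_API = "https://huggingface.co/api/models/{repo_id}"
# Assumption: default local Ollama endpoint; adjust if your server runs elsewhere.
OLLAMA_TAGS = "http://localhost:11434/api/tags"


def hub_status(repo_id, timeout=10.0):
    """Return the HTTP status code the Hub API reports for repo_id."""
    url = HUB_API.format(repo_id=repo_id)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def missing_on_hub(repo_ids, status_fn=hub_status):
    """Return the repo ids whose lookup did not come back as HTTP 200."""
    return [rid for rid in repo_ids if status_fn(rid) != 200]


def fetch_ollama_tags(url=OLLAMA_TAGS, timeout=5.0):
    """Fetch the JSON body listing locally installed Ollama models."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def missing_in_ollama(tags, payload):
    """Return the required tags that are absent from an /api/tags payload."""
    installed = {m["name"] for m in payload.get("models", [])}
    return sorted(set(tags) - installed)
```

At startup you might call missing_on_hub([...]) and missing_in_ollama([...], fetch_ollama_tags()) and abort with a clear message if either list is non-empty. The status_fn parameter exists so the check can be exercised in tests without network access.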
Causes & fixes
Loading Qwen1 or Qwen1.5 from HuggingFace via model_id string (e.g., 'Qwen/qwen-1.5-7b-chat'): model no longer exists on Hub
Replace your model_id with a Qwen2.5 equivalent, such as 'Qwen/Qwen2.5-7B-Instruct' or 'Qwen/Qwen2.5-14B-Instruct' (the repository names as they are listed on the Hub). Visit https://huggingface.co/Qwen to see current model listings.
Using Ollama with deprecated qwen:1.5 or qwen:1 tags in ollama pull or client.generate(model='qwen:1.5')
Update to 'qwen2.5:7b', 'qwen2.5:14b', or 'qwen2.5:32b' in your Ollama commands and code. Run 'ollama pull qwen2.5:7b' to download the new model.
LangChain or LiteLLM configuration files or environment variables still reference old Qwen1/1.5 model names
Search your codebase, .env files, and config YAMLs for 'qwen-1', 'qwen1-', 'qwen:1', or 'qwen:1.5' and replace with 'qwen2.5-7b-instruct' or equivalent. Reload environment variables.
vLLM or another inference server configured with old model checkpoints from pre-2025 snapshots
Update your model serving configuration to point to the latest Qwen2.5 model checkpoint on HuggingFace. For vLLM: use '--model Qwen/Qwen2.5-7B-Instruct' instead of '--model Qwen/qwen-1.5-7b-chat'.
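Every fix above starts with finding the old identifiers. A minimal sketch of that search follows, assuming the deprecated names all look like "qwen" followed by an optional "-" or ":" and then a "1" that is not part of a longer number (so 'qwen1-7b', 'qwen-1.5' and 'qwen:1.5' match, while 'qwen2.5:14b' does not). The regular expression, the function names, and the list of file suffixes are illustrative choices; tune them to your repository.

```python
import re
from pathlib import Path

# Assumption: this pattern covers the identifier styles named in this article.
DEPRECATED = re.compile(r"qwen[-:]?1(?!\d)", re.IGNORECASE)

# Assumption: typical places where model names are configured.
SUFFIXES = (".py", ".env", ".yaml", ".yml", ".json", ".toml")


def find_deprecated(text):
    """Return (line_number, stripped_line) pairs that mention a Qwen1/1.5 id."""
    return [
        (n, line.strip())
        for n, line in enumerate(text.splitlines(), start=1)
        if DEPRECATED.search(line)
    ]


def scan_tree(root, suffixes=SUFFIXES):
    """Yield (path, line_number, line) for every match under root."""
    for path in Path(root).rglob("*"):
        # A bare ".env" file has an empty suffix, so check its name as well.
        if not path.is_file():
            continue
        if path.suffix not in suffixes and path.name != ".env":
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip unreadable or binary files
        for n, line in find_deprecated(text):
            yield path, n, line
```

Running `for hit in scan_tree("."): print(*hit)` from the project root would print each file, line number, and offending line, which you can then update by hand.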
Code: broken vs fixed
import os
from transformers import AutoTokenizer, AutoModelForCausalLM
# ❌ BROKEN: qwen1.5 model no longer exists on HuggingFace
model_id = 'Qwen/qwen-1.5-7b-chat' # ← This will raise ModelNotFoundError
token = os.environ.get('HUGGING_FACE_TOKEN')
tokenizer = AutoTokenizer.from_pretrained(model_id, token=token)
model = AutoModelForCausalLM.from_pretrained(model_id, token=token, device_map='auto')
print('Model loaded:', model_id)

import os
from transformers import AutoTokenizer, AutoModelForCausalLM
# ✅ FIXED: Use Qwen2.5 series (available as of 2025)
model_id = 'Qwen/Qwen2.5-7B-Instruct' # ← Migrated from qwen-1.5-7b-chat
token = os.environ.get('HUGGING_FACE_TOKEN')
tokenizer = AutoTokenizer.from_pretrained(model_id, token=token)
model = AutoModelForCausalLM.from_pretrained(model_id, token=token, device_map='auto')
print(f'Model loaded successfully: {model_id}')

Workaround
If you absolutely cannot migrate yet (rare), you can load an older cached copy of Qwen1.5 from a local directory by passing the path as the first argument, e.g. AutoModelForCausalLM.from_pretrained('/local/qwen-1.5-7b-chat'), but this is unsupported and may break when you upgrade transformers. A better interim fix is to use an older Ollama snapshot that still ships Qwen1.5 ('ollama run qwen:1.5'), but expect this to stop working within months. The only real solution is immediate migration to Qwen2.5.
Prevention
Adopt a model versioning strategy. Pin your model IDs in code and configuration with the full HuggingFace namespace (e.g., 'Qwen/Qwen2.5-7B-Instruct'). Monitor the official Qwen GitHub releases and HuggingFace model cards for deprecation notices. Run quarterly audits of your model IDs against the live Model Hub using huggingface_hub.model_info() to catch deprecations before they hit production.
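One way to make the pinning advice concrete is to keep the model ID and revision in a single validated object instead of scattering strings through the code. The sketch below is a hypothetical helper, not an API of transformers or huggingface_hub; the ModelPin class and the MODEL_ID and MODEL_REVISION environment variable names are assumptions made for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPin:
    """A fully namespaced model id plus the revision to load."""
    repo_id: str
    revision: str = "main"

    def __post_init__(self):
        # Require exactly "org/name" so bare names such as "qwen2.5-7b" are rejected.
        org, sep, name = self.repo_id.partition("/")
        if not (sep and org and name) or "/" in name:
            raise ValueError(f"expected 'org/name', got {self.repo_id!r}")


def pin_from_env(default, var="MODEL_ID", rev_var="MODEL_REVISION"):
    """Build a pin from environment variables, falling back to the default."""
    return ModelPin(
        repo_id=os.environ.get(var, default.repo_id),
        revision=os.environ.get(rev_var, default.revision),
    )


DEFAULT_PIN = ModelPin("Qwen/Qwen2.5-7B-Instruct")
```

The resulting values can be passed straight to the loader, for example from_pretrained(pin.repo_id, revision=pin.revision), so a quarterly audit only has to check one place.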