ValueError: chat_template
ValueError: Llama 2 chat template format is deprecated, use Llama 3 format
Stack trace
ValueError: Llama 2 chat template format is deprecated, use Llama 3 format instead. See https://huggingface.co/meta-llama/Llama-2-7b-chat/discussions/10 for migration guide.
File "transformers/models/llama/tokenizer_llama.py", line 412, in _build_chat_template
raise ValueError(msg)
The applied chat template:
{chat_template}
Does not match the expected Llama 2 format. Update to Llama 3 or set use_default_system_prompt=False.
Why it happens
Llama 2's chat template wraps each user turn in `[INST]` and `[/INST]` markers, a format that proved rigid and error-prone in production. Llama 3 replaced it with a role-based format in which every message carries an explicit role header. When you load a Llama 2 model with transformers 4.36 or later, the library detects the old template and refuses to use it, so you must either move to a Llama 3 model or explicitly disable template validation. This is intentional: Llama 2 is deprecated in favor of Llama 3.2 and Llama 3.3, which follow instructions and handle multi-turn chat better.
Detection
Check your tokenizer logs, or wrap tokenizer loading and the first `apply_chat_template()` call in `try`/`except ValueError`. The error surfaces immediately, either in `AutoTokenizer.from_pretrained()` or on the first `apply_chat_template()` call. Look for 'Llama 2 chat template' in the error message.
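The detection step above can be sketched as a small wrapper. This is a minimal sketch, not part of transformers: `is_llama2_template_error` and `render_prompt` are hypothetical helper names, and the match on 'Llama 2 chat template' assumes the message text quoted in the stack trace above.

```python
def is_llama2_template_error(exc: Exception) -> bool:
    """True if exc looks like the chat-template error described above."""
    # Assumption: the message contains the phrase quoted in the stack trace.
    return isinstance(exc, ValueError) and "Llama 2 chat template" in str(exc)


def render_prompt(tokenizer, messages):
    """Render messages, converting the template error into a clearer failure."""
    try:
        return tokenizer.apply_chat_template(messages, tokenize=False)
    except ValueError as exc:
        if is_llama2_template_error(exc):
            raise RuntimeError(
                "Tokenizer still uses a Llama 2 chat template; see 'Causes & fixes'."
            ) from exc
        raise  # unrelated ValueError: let it propagate unchanged
```

Any object with an `apply_chat_template(messages, tokenize=False)` method can be passed as `tokenizer`, so the wrapper can be exercised in tests without downloading a model.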
Causes & fixes
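Loading a Llama 2 chat model (Llama-2-7b-chat, Llama-2-13b-chat) with transformers 4.36+, which enforces the Llama 3 template format
Upgrade to a Llama 3 model: use Llama-3.2-3B-Instruct, Llama-3.1-8B-Instruct (the Llama 3.2 text models come in 1B and 3B sizes only, so the 8B option is Llama 3.1), or Llama-3.3-70B-Instruct instead. They load through the same AutoTokenizer and AutoModelForCausalLM calls and perform better, but they use a different tokenizer and prompt format, so re-test any prompts you carry over.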
Using Llama 2 model but calling apply_chat_template() without setting use_default_system_prompt=False, triggering format validation
Set use_default_system_prompt=False in apply_chat_template(): tokenizer.apply_chat_template(messages, use_default_system_prompt=False, tokenize=False)
Custom chat template in tokenizer_config.json that follows Llama 2 format, but transformers expects Llama 3 format
Update your tokenizer_config.json to use the Llama 3 format (see https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct/blob/main/tokenizer_config.json for a reference), or use a Llama 3 model directly.
Manually constructing Llama 2 prompt format (using [INST] markers) instead of using apply_chat_template()
Switch to tokenizer.apply_chat_template(messages) and pass the conversation as a list of role/content dicts: [{'role': 'user', 'content': '...'}, {'role': 'assistant', 'content': '...'}]
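For the last cause, existing hand-built prompts have to be turned back into a messages list before `apply_chat_template()` can take over. The following is a rough sketch under stated assumptions: `llama2_prompt_to_messages` is a hypothetical helper, and it assumes the prompts follow the usual Llama 2 layout (`[INST] ... [/INST]` turns, with an optional `<<SYS>> ... <</SYS>>` block inside the first turn). Prompts that deviate from that layout need manual review.

```python
import re

# One "[INST] user text [/INST] optional assistant text" turn.
_TURN = re.compile(r"\[INST\](.*?)\[/INST\](.*?)(?=\[INST\]|\Z)", re.DOTALL)
# Optional system block embedded in a user turn.
_SYS = re.compile(r"<<SYS>>(.*?)<</SYS>>", re.DOTALL)


def llama2_prompt_to_messages(prompt: str) -> list[dict]:
    """Convert a hand-built Llama 2 prompt string into role/content messages."""
    messages = []
    for user_part, assistant_part in _TURN.findall(prompt):
        sys_match = _SYS.search(user_part)
        if sys_match:
            messages.append({"role": "system", "content": sys_match.group(1).strip()})
            user_part = _SYS.sub("", user_part)
        messages.append({"role": "user", "content": user_part.strip()})
        # Drop the <s> and </s> sequence markers around the assistant reply.
        reply = assistant_part.replace("<s>", "").replace("</s>", "").strip()
        if reply:
            messages.append({"role": "assistant", "content": reply})
    return messages
```

The resulting list can be passed straight to `tokenizer.apply_chat_template(messages, tokenize=False)`, which leaves the formatting to the model's own template.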
Code: broken vs fixed
# Broken: loads a Llama 2 chat model
import os
from transformers import AutoTokenizer, AutoModelForCausalLM

# The "-hf" repo holds the transformers-format weights for this model.
model_name = 'meta-llama/Llama-2-7b-chat-hf'  # ❌ Deprecated: triggers ValueError
tokenizer = AutoTokenizer.from_pretrained(model_name, token=os.environ['HF_TOKEN'])
model = AutoModelForCausalLM.from_pretrained(model_name, token=os.environ['HF_TOKEN'])

messages = [
    {'role': 'user', 'content': 'What is Python?'}
]

# This line raises: ValueError: Llama 2 chat template format is deprecated, use Llama 3 format
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)

# Fixed: loads a Llama 3 model
import os
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = 'meta-llama/Llama-3.2-3B-Instruct'  # ✅ Fixed: Llama 3 model, no deprecation
tokenizer = AutoTokenizer.from_pretrained(model_name, token=os.environ['HF_TOKEN'])
model = AutoModelForCausalLM.from_pretrained(model_name, token=os.environ['HF_TOKEN'])

messages = [
    {'role': 'user', 'content': 'What is Python?'}
]

# Works: the Llama 3 format is supported.
# add_generation_prompt=True appends the assistant header so the model starts its reply.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
# Output (abridged; the exact text depends on the model's template, which may also
# prepend <|begin_of_text|> and a system header):
# <|start_header_id|>user<|end_header_id|>\n\nWhat is Python?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n
Workaround
If you must keep using Llama 2 temporarily, pass use_default_system_prompt=False to apply_chat_template() to skip template validation: `tokenizer.apply_chat_template(messages, use_default_system_prompt=False, tokenize=False)`. This bypasses the format checks, so the prompt may no longer match what the model expects and output quality can suffer. Migrate to Llama 3 as soon as possible.
Prevention
Always target Llama 3.2 or Llama 3.3 for new projects: they are production-ready, follow instructions better, and are fully supported by transformers. Llama 2 reached end-of-life in 2024. Pin versions in your requirements: an exact `transformers` release (4.43.0 or later, which Llama 3.1 and 3.2 checkpoints require) paired with a specific model such as `meta-llama/Llama-3.2-3B-Instruct`. Add model versioning tests that catch deprecation warnings before production deployment.
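One way to make such a test fail on deprecations is to promote warnings to errors around the call under test. The sketch below uses only the standard `warnings` module. `call_strict` is a hypothetical helper name, and the choice of `DeprecationWarning` and `FutureWarning` is an assumption about which categories the library emits for deprecated behavior.

```python
import warnings


def call_strict(fn, *args, **kwargs):
    """Call fn, raising if it emits a DeprecationWarning or FutureWarning."""
    with warnings.catch_warnings():
        # Inside this block the two categories are raised as exceptions;
        # the previous filters are restored when the block exits.
        warnings.simplefilter("error", DeprecationWarning)
        warnings.simplefilter("error", FutureWarning)
        return fn(*args, **kwargs)
```

In a test suite this could wrap the tokenizer call, for example `call_strict(tokenizer.apply_chat_template, messages, tokenize=False)`, so a deprecation notice fails the build instead of scrolling past in the logs.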