ValueError: device_map='auto' requires accelerate library
ValueError: device_map='auto' requires the accelerate library (https://huggingface.co/docs/accelerate)
Stack trace
Traceback (most recent call last):
  File "load_model.py", line 8, in <module>
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", device_map="auto")
  File "/usr/local/lib/python3.9/site-packages/transformers/modeling_utils.py", line 1234, in from_pretrained
    raise ValueError(
        "device_map='auto' requires the accelerate library ("
        "https://huggingface.co/docs/accelerate). Please install it with "
        "`pip install accelerate`."
    )
ValueError: device_map='auto' requires the accelerate library (https://huggingface.co/docs/accelerate). Please install it with `pip install accelerate`.
Why it happens
The transformers library delegates device_map='auto' to accelerate, which computes a placement of model layers across the available GPUs, CPU memory, and disk so that the model fits. Without accelerate installed, transformers has no way to compute that placement and raises this error. accelerate is therefore a hard dependency whenever you pass device_map='auto', which is typical for large models such as Llama 3.3 70B that do not fit in a single GPU's memory.
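To make the idea of layer placement concrete, here is a toy sketch of filling devices in order of preference. It is not accelerate's actual algorithm; the function name `place_layers`, the layer names, and the sizes and budgets are all made up for illustration.

```python
def place_layers(layer_sizes_gb, budgets_gb):
    """Assign each layer, in order, to the first device that still has room.

    budgets_gb is ordered by preference (for example GPU, then CPU, then disk).
    Once a device is full, later layers spill over to the next device.
    """
    remaining = dict(budgets_gb)
    devices = list(remaining)
    placement = {}
    current = 0  # index of the device currently being filled
    for name, size in layer_sizes_gb.items():
        # Skip ahead until a device with enough free space is found.
        while current < len(devices) and remaining[devices[current]] < size:
            current += 1
        if current == len(devices):
            raise MemoryError(f"no device has room for {name}")
        device = devices[current]
        remaining[device] -= size
        placement[name] = device
    return placement


# Hypothetical layer sizes (GB) and per-device budgets (GB).
layers = {"embed": 1.0, "block_0": 2.0, "block_1": 2.0, "block_2": 2.0, "lm_head": 1.0}
budgets = {"cuda:0": 4.0, "cpu": 3.0, "disk": float("inf")}
placement = place_layers(layers, budgets)
print(placement)
```

With these numbers the first two layers land on the GPU, one block spills to CPU memory, and the rest go to disk, which is the kind of mapping that lets a model load even when it exceeds GPU memory.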
Detection
Check your requirements.txt or pip list before loading large Llama models with device_map='auto'. Monitor your Python environment in CI/CD pipelines to ensure accelerate is listed as an explicit dependency.
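The requirements.txt check can be automated. The sketch below is a minimal, assumption-laden parser: `declares_package` is a hypothetical helper, and it only handles simple one-requirement-per-line files (it skips option lines such as `-r other.txt` rather than following them).

```python
import re


def declares_package(requirements_text: str, package: str) -> bool:
    """Return True if any requirement line names `package`."""
    wanted = package.lower().replace("_", "-")
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The package name ends at the first specifier, marker, extra, or space.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        if name.lower().replace("_", "-") == wanted:
            return True
    return False


# Sample file contents; in practice, read them from requirements.txt.
sample = "transformers>=4.30.0\naccelerate>=0.24.0\n"
print(declares_package(sample, "accelerate"))
```

In a pipeline, the same function could be fed `Path("requirements.txt").read_text()` and the build failed when it returns False.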
Causes & fixes
Cause: The accelerate library is not installed in the Python environment.
Fix: Install it with `pip install accelerate`, or add `accelerate>=0.24.0` to your requirements.txt and reinstall.
Cause: device_map='auto' is used where it is not needed (a single GPU with enough VRAM).
Fix: Remove device_map='auto' (the default, device_map=None, loads the model on CPU) and move the model to the GPU with `model.to("cuda:0")` for simpler single-GPU loading.
Cause: An outdated transformers version that does not handle the device_map parameter correctly.
Fix: Upgrade transformers with `pip install --upgrade "transformers>=4.30.0"` so the accelerate integration is stable. Quote the specifier so the shell does not treat `>` as a redirect.
Cause: Virtual environment isolation: accelerate is installed globally but not in the active venv.
Fix: Confirm which interpreter is active with `which python`, then install accelerate into that interpreter with `python -m pip install accelerate`.
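For the virtual-environment case, a short diagnostic run with the same interpreter that loads the model can show where the mismatch is. This is a sketch using only the standard library; `describe_environment` is a hypothetical helper name.

```python
import importlib.util
import sys


def describe_environment(package: str = "accelerate") -> dict:
    """Report which interpreter is running and where it finds `package`."""
    spec = importlib.util.find_spec(package)
    return {
        "interpreter": sys.executable,
        # Inside a venv, sys.prefix differs from the base installation prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # None means this interpreter cannot import the package.
        "package_location": spec.origin if spec is not None else None,
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If `package_location` is None while `pip list` shows accelerate, pip and python are most likely pointing at different environments.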
Code: broken vs fixed
# Broken: crashes when accelerate is not installed
from transformers import AutoModelForCausalLM, AutoTokenizer
import os

model_id = "meta-llama/Llama-2-7b-hf"
token = os.environ.get("HF_TOKEN")

# This call fails without accelerate installed:
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # ❌ Requires accelerate; crashes here
    token=token,
)
tokenizer = AutoTokenizer.from_pretrained(model_id, token=token)

# Fixed: use accelerate when available, otherwise load on a single GPU
from transformers import AutoModelForCausalLM, AutoTokenizer
import os

# First, ensure accelerate is installed: pip install accelerate
model_id = "meta-llama/Llama-2-7b-hf"
token = os.environ.get("HF_TOKEN")

try:
    # ✅ Option 1: accelerate is installed, so device_map='auto' works
    # for multi-GPU setups or large models
    import accelerate  # noqa: F401 (imported only to check availability)

    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",  # ✅ Works because accelerate is installed
        token=token,
        torch_dtype="auto",
    )
    print("Model loaded with device_map='auto' using accelerate")
except ImportError:
    print("accelerate not installed (run: pip install accelerate); falling back to cuda:0")
    # ✅ Option 2: fall back to a single GPU. A device_map value such as
    # "cuda:0" also relies on accelerate, so load without device_map and
    # move the model to the GPU explicitly.
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        token=token,
        torch_dtype="auto",
    ).to("cuda:0")
    print("Model loaded on cuda:0 (accelerate not available)")

tokenizer = AutoTokenizer.from_pretrained(model_id, token=token)
Workaround
If you cannot install accelerate immediately, remove device_map='auto' so the model loads with the default device_map=None (on CPU), then move it to a single GPU with `model.to("cuda:0")` if it fits in VRAM. This trades memory optimization for availability. For production, use a containerized environment (Docker) with accelerate pre-installed to guarantee dependency consistency across deployments.
Prevention
Declare accelerate>=0.24.0 in your requirements.txt or setup.py before deploying Llama models. In CI/CD, add a pre-deployment check, `python -c 'import accelerate'`, to verify that the dependency is importable. For Docker deployments, include `RUN pip install transformers accelerate` in the Dockerfile so both libraries are present in the image. If you need version-specific control, use environment markers in requirements.txt: `accelerate>=0.24.0; python_version>='3.8'`.
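The pre-deployment check can also verify the minimum version, not just importability. The following is a sketch under stated assumptions: `version_tuple`, `check_dependency`, and `main` are hypothetical helpers, and the version comparison is deliberately simple (it looks only at the first three numeric components and ignores pre-release ordering).

```python
import re
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '0.24.0' or '1.2.0.dev0' into a comparable (major, minor, patch)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad short versions such as '1.0'
        parts.append(0)
    return tuple(parts)


def check_dependency(name: str, minimum: str):
    """Return (ok, message) for an installed-distribution version check."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return False, f"{name} is not installed"
    if version_tuple(installed) < version_tuple(minimum):
        return False, f"{name} {installed} is older than required {minimum}"
    return True, f"{name} {installed} satisfies >={minimum}"


def main() -> int:
    """Exit-code style entry point: 0 when the dependency is satisfied."""
    ok, message = check_dependency("accelerate", "0.24.0")
    print(message)
    return 0 if ok else 1
```

A CI step could call `sys.exit(main())` from a small script so the pipeline fails before the model load is ever attempted.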