High severity · Beginner · Fix: 2-5 min

ValueError: trust_remote_code=True required

ValueError raised from transformers.modeling_utils (trust_remote_code parameter)

What this error means

Qwen2-VL requires trust_remote_code=True when loading from HuggingFace because its custom vision encoder and image processing code are not part of the standard transformers library.

Stack trace

traceback

Traceback (most recent call last):
  File "app.py", line 42, in <module>
    model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")
  File "/usr/local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 555, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3114, in from_pretrained
    raise ValueError(
ValueError: Custom modeling code detected in repo Qwen/Qwen2-VL-7B-Instruct. Set `trust_remote_code=True` to use this model. For more information, see https://huggingface.co/docs/transformers/custom_code

QUICK FIX

Add trust_remote_code=True to both AutoModelForCausalLM.from_pretrained() and AutoProcessor.from_pretrained() calls for Qwen2-VL.

Why it happens

Qwen2-VL uses custom vision encoding logic (for image patch tokenization) and image processing functions that aren't part of the official transformers library distribution. By default, transformers refuses to load custom modeling code as a security measure, because arbitrary Python code from a HuggingFace repo could contain malware. You must explicitly opt in with trust_remote_code=True to acknowledge that you've reviewed the repo and accept the risk.
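As a rough illustration of how a repo signals that it ships custom code: such repos usually declare it under an `auto_map` key in config.json, which maps Auto classes to Python modules stored in the repo. The helper name `declares_custom_code` and the sample config below are hypothetical and exist only for this example. This is a minimal sketch, not the check transformers itself performs.

```python
import json


def declares_custom_code(config: dict) -> bool:
    """Return True if a config.json dict maps Auto classes to repo-local code."""
    auto_map = config.get("auto_map")
    return isinstance(auto_map, dict) and len(auto_map) > 0


# Hypothetical config.json contents, for illustration only.
sample_config = json.loads("""
{
  "model_type": "example_vl",
  "auto_map": {
    "AutoConfig": "configuration_example.ExampleConfig",
    "AutoModelForCausalLM": "modeling_example.ExampleForCausalLM"
  }
}
""")

print(declares_custom_code(sample_config))            # True
print(declares_custom_code({"model_type": "llama"}))  # False
```

If the check returns True for a repo you plan to load, expect to be asked for trust_remote_code=True.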

Detection

Wrap your model loading code in a try/except block and check whether the ValueError message contains 'trust_remote_code'. To see when custom code paths are invoked during model initialization, enable debug logging with transformers.logging.set_verbosity_debug().
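The detection step can be sketched as a small wrapper around any loading call. The names `needs_trust_remote_code`, `try_load`, and `fake_loader` are hypothetical. The stand-in loader simply raises a ValueError with a similar message so the example runs without downloading a model.

```python
from typing import Any, Callable, Optional, Tuple


def needs_trust_remote_code(exc: BaseException) -> bool:
    """True if the exception looks like the trust_remote_code ValueError."""
    return isinstance(exc, ValueError) and "trust_remote_code" in str(exc)


def try_load(loader: Callable[[], Any]) -> Tuple[Optional[Any], bool]:
    """Call loader(); return (result, flagged).

    flagged is True when loading failed because trust_remote_code is needed.
    Any other ValueError is re-raised unchanged.
    """
    try:
        return loader(), False
    except ValueError as exc:
        if needs_trust_remote_code(exc):
            return None, True
        raise


# Stand-in for a real from_pretrained() call (assumption for this sketch).
def fake_loader() -> Any:
    raise ValueError(
        "Custom modeling code detected. Set `trust_remote_code=True` to use this model."
    )


result, flagged = try_load(fake_loader)
print(flagged)  # True
```

In real code, the loader would be something like `lambda: AutoModelForCausalLM.from_pretrained(model_id)`.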

Causes & fixes

AutoModelForCausalLM.from_pretrained() called without trust_remote_code parameter

✓ Fix

Add trust_remote_code=True to from_pretrained() call: AutoModelForCausalLM.from_pretrained('Qwen/Qwen2-VL-7B-Instruct', trust_remote_code=True)

AutoProcessor.from_pretrained() also needs trust_remote_code because Qwen2-VL has custom image processing

✓ Fix

Add trust_remote_code=True to processor loading: processor = AutoProcessor.from_pretrained('Qwen/Qwen2-VL-7B-Instruct', trust_remote_code=True)

Using quantized versions (GPTQ, AWQ) which may have additional custom kernels

✓ Fix

For quantized models, enable both trust_remote_code=True and device_map='auto': model = AutoModelForCausalLM.from_pretrained('Qwen/Qwen2-VL-7B-Instruct-GPTQ', trust_remote_code=True, device_map='auto')

Loading model config separately without trust_remote_code blocks downstream vision processing

✓ Fix

Pass trust_remote_code=True when loading config: config = AutoConfig.from_pretrained('Qwen/Qwen2-VL-7B-Instruct', trust_remote_code=True)
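Since the fixes above all pass the same flag, one way to keep the config, processor, and model calls consistent is to build the keyword arguments once and reuse them. The helper `build_load_kwargs` is hypothetical. `revision` is an existing from_pretrained() parameter that pins a specific commit, and the hash in the usage note is a placeholder. The from_pretrained() lines are left as comments so the sketch runs without transformers installed.

```python
from typing import Any, Dict, Optional


def build_load_kwargs(
    trust_remote_code: bool, revision: Optional[str] = None
) -> Dict[str, Any]:
    """Build keyword arguments shared by every from_pretrained() call."""
    kwargs: Dict[str, Any] = {"trust_remote_code": trust_remote_code}
    if revision is not None:
        # Pin the repo to one commit so the reviewed code is the code that runs.
        kwargs["revision"] = revision
    return kwargs


shared = build_load_kwargs(trust_remote_code=True)
print(shared)  # {'trust_remote_code': True}

# Usage (requires transformers; not executed in this sketch):
# config = AutoConfig.from_pretrained(model_id, **shared)
# processor = AutoProcessor.from_pretrained(model_id, **shared)
# model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", **shared)
```

Passing a commit hash, for example `build_load_kwargs(True, revision="<commit-hash>")`, keeps later changes to the repo from being picked up silently.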

Code: broken vs fixed

Broken - triggers the error

python

from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "Qwen/Qwen2-VL-7B-Instruct"

# This line will fail with ValueError about trust_remote_code
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype="auto"
)

processor = AutoProcessor.from_pretrained(model_id)

print("Model loaded successfully")

Fixed - works correctly

python

import requests
import torch  # needed for torch.no_grad() below
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "Qwen/Qwen2-VL-7B-Instruct"

# FIX: Add trust_remote_code=True to both model and processor loading
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True  # Enable custom vision encoding logic
)

processor = AutoProcessor.from_pretrained(
    model_id,
    trust_remote_code=True  # Enable custom image processing
)

print("Model loaded successfully")

# Verify with a simple inference
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg"
image = Image.open(requests.get(image_url, stream=True).raw)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What is in this image?"}
        ]
    }
]

text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(text=[text], images=[image], padding=True, return_tensors="pt")
inputs = inputs.to(model.device)  # follow the device chosen by device_map="auto"

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)

response = processor.decode(output_ids[0], skip_special_tokens=True)
print(f"Model response: {response}")

Added trust_remote_code=True to both from_pretrained() calls to permit loading of Qwen2-VL's custom vision encoder and image processor code from the HuggingFace repo, and added a complete working inference example to verify the fix.

Workaround

If you cannot update transformers and absolutely must load Qwen2-VL locally, manually download the modeling files from the HuggingFace repo (modeling_qwen2_vl.py, image_processing_qwen2_vl.py), save them in your local model directory, and load with AutoModelForCausalLM.from_pretrained('./local_qwen_path', local_files_only=True). This approach is fragile and not recommended for production. Note that transformers may still ask for trust_remote_code=True if the local config references custom code.

Prevention

Always audit custom modeling code before enabling trust_remote_code=True: visit the HuggingFace repo, review the modeling_*.py files, and confirm they don't contain suspicious code. For production deployments, use official quantized versions (GPTQ, AWQ) or switch to fully open-source multimodal models with standard transformers implementations, such as LLaVA-1.5, that don't require custom code.
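To make the audit step concrete, the sketch below picks out the Python files from a repo file listing so none are missed during review. The helper `files_to_review` and the sample listing are hypothetical. In practice, one option is to get the listing from `huggingface_hub.list_repo_files(repo_id)`, which needs network access and is not called here.

```python
from typing import Iterable, List


def files_to_review(filenames: Iterable[str]) -> List[str]:
    """Return the Python source files in a repo listing, sorted for a stable review order."""
    return sorted(name for name in filenames if name.endswith(".py"))


# Hypothetical repo listing, for illustration only.
listing = [
    "config.json",
    "modeling_example.py",
    "model.safetensors",
    "image_processing_example.py",
    "README.md",
]

for name in files_to_review(listing):
    print(name)
```

Every file printed is code that would run on your machine once trust_remote_code=True is set, so each one deserves a read.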

Python 3.10+ · transformers >=4.38.0 · tested on 4.42.x

Verified 2026-04 · Qwen/Qwen2-VL-7B-Instruct, Qwen/Qwen2-VL-32B-Instruct
