High severity intermediate · Fix: 5-10 min

RuntimeError

torch.nn.modules.module.RuntimeError

What this error means
This error occurs when the tensor shapes of the loaded PEFT adapter weights do not match the parameter shapes the model expects during LoRA or QLoRA fine-tuning or inference.

Stack trace

traceback
RuntimeError: The size of tensor a (64) must match the size of tensor b (128) at non-singleton dimension 1
  File "/usr/local/lib/python3.9/site-packages/peft/utils/other.py", line 123, in load_adapter
    model.load_state_dict(adapter_state_dict, strict=False)
  File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1614, in load_state_dict
    raise RuntimeError(f"Error(s) in loading state_dict for {self.__class__.__name__}:\n" + error_msgs)
QUICK FIX
Verify and match the base model architecture and PEFT config parameters exactly to the adapter weights before loading.

Why it happens

PEFT adapters such as LoRA store weights whose tensor dimensions are tied to the base model's layers. If the adapter was trained or saved against a different model architecture or configuration (e.g., a different hidden size or LoRA rank), loading it raises a dimension mismatch error.

Detection

Check for RuntimeError during adapter weight loading mentioning mismatched tensor sizes; log the shapes of model parameters and adapter weights before loading to detect dimension conflicts early.
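The shape check described above can be sketched without loading anything into the model. This is a minimal sketch, not PEFT's own validation: the function name `find_lora_mismatches` is hypothetical, and it assumes saved adapter keys look like `base_model.model.<module path>.lora_A.weight` and that the targeted layers are ordinary linear layers with unquantized 2-D weights.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]

# Assumption: saved LoRA adapter keys look like
# "base_model.model.<module path>.lora_A.weight".
PEFT_PREFIX = "base_model.model."


def find_lora_mismatches(
    base_shapes: Dict[str, Shape], adapter_shapes: Dict[str, Shape]
) -> List[str]:
    """Return one message per LoRA matrix that cannot fit its base layer."""
    problems: List[str] = []
    for key, shape in adapter_shapes.items():
        for tag in ("lora_A", "lora_B"):
            marker = f".{tag}."
            if marker in key:
                module = key.split(marker)[0]
                break
        else:
            continue  # not a LoRA A/B matrix
        if module.startswith(PEFT_PREFIX):
            module = module[len(PEFT_PREFIX):]
        base = base_shapes.get(f"{module}.weight")
        if base is None:
            problems.append(f"{key}: base model has no '{module}.weight'")
            continue
        if len(base) != 2 or len(shape) != 2:
            continue  # only 2-D linear weights are checked in this sketch
        out_features, in_features = base
        # lora_A is (r, in_features); lora_B is (out_features, r).
        if tag == "lora_A" and shape[1] != in_features:
            problems.append(
                f"{key}: expects in_features={shape[1]}, base has {in_features}"
            )
        if tag == "lora_B" and shape[0] != out_features:
            problems.append(
                f"{key}: expects out_features={shape[0]}, base has {out_features}"
            )
    return problems
```

The shape dictionaries can be built with `{name: tuple(t.shape) for name, t in state_dict.items()}` from the base model's `state_dict()` and from the loaded adapter file. With a 4-bit quantized base model the stored weights are packed, so this direct comparison would likely not apply.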

Causes & fixes

1

Adapter weights were trained on a model with different hidden size or layer dimensions than the current base model.

✓ Fix

Ensure the base model architecture matches exactly the one used to train the adapter weights, including hidden size, number of layers, and attention heads.

2

Using a PEFT configuration whose shape-determining parameters, such as the LoRA rank (r) or the target modules, differ between training and loading.

✓ Fix

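Match the rank (r) and target modules used during training, since these determine the adapter tensor shapes. Alpha and dropout do not change tensor shapes, but keep them identical as well so the adapter behaves as it did in training.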

3

Loading adapter weights saved from a different model checkpoint or variant (e.g., loading weights from a 7B model into a 13B model).

✓ Fix

Always load adapter weights from the same model variant and checkpoint size to avoid dimension mismatches.
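The causes above can often be ruled out before any weights are loaded by reading the `adapter_config.json` file that PEFT saves next to the adapter weights. The following is a minimal sketch, assuming that file contains the `base_model_name_or_path` and `r` fields; the helper name `check_adapter_config` and its parameters are hypothetical.

```python
import json
from pathlib import Path
from typing import List, Optional


def check_adapter_config(
    adapter_dir: str, base_model_name: str, expected_rank: Optional[int] = None
) -> List[str]:
    """Compare adapter_config.json against what the caller intends to load."""
    config_path = Path(adapter_dir) / "adapter_config.json"
    config = json.loads(config_path.read_text())
    problems: List[str] = []

    # The base model recorded at training time should be the one being loaded.
    trained_on = config.get("base_model_name_or_path")
    if trained_on != base_model_name:
        problems.append(
            f"adapter was trained on {trained_on!r}, not {base_model_name!r}"
        )

    # The rank fixes the shapes of the LoRA matrices.
    if expected_rank is not None and config.get("r") != expected_rank:
        problems.append(
            f"adapter rank r={config.get('r')}, expected {expected_rank}"
        )
    return problems
```

An empty list means the recorded base model and rank agree with what is about to be loaded. A name mismatch is only a hint, since the recorded value may be a local path from the training machine that points at the same model.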

Code: broken vs fixed

Broken - triggers the error
python
from peft import PeftModel
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained('model-name')
adapter_path = 'adapter/checkpoint'
peft_model = PeftModel.from_pretrained(model, adapter_path)  # RuntimeError: dimension mismatch here
Fixed - works correctly
python
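from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM

adapter_path = 'adapter/checkpoint'
# Read the saved adapter config to find the base model the adapter was trained on
peft_config = PeftConfig.from_pretrained(adapter_path)
# Note: this may be a local path from the training machine and need adjusting
base_model_name = peft_config.base_model_name_or_path
model = AutoModelForCausalLM.from_pretrained(base_model_name)
# LoRA settings (rank, target modules) are read from the saved adapter config
peft_model = PeftModel.from_pretrained(model, adapter_path)
print('Adapter loaded successfully with matching dimensions')
Loading the base model named in the adapter's own config ensures the base model matches the one used for adapter training, and the adapter weights are loaded with the PEFT config they were saved with, which avoids the dimension mismatch.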

Workaround

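Catch the RuntimeError during adapter loading, then inspect the adapter and base model shapes to find which layers disagree. Reshaping or reinitializing adapter weights to fit the base model lets the load succeed but discards what those weights learned, so treat it as a stopgap and prefer loading the correct base model or retraining the adapter.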

Prevention

Maintain strict version control and documentation of base model architectures and PEFT config parameters used during adapter training to guarantee compatibility when loading weights.
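One way to make that documentation checkable is to store a small manifest of the base model's architecture next to the adapter and compare it at load time. This is a sketch under assumptions: the file name `base_model_manifest.json` and both helper functions are hypothetical, and the tracked keys follow the naming used by many Hugging Face model configs (other architectures may use different names).

```python
import json
from pathlib import Path
from typing import Any, Dict, List

MANIFEST_NAME = "base_model_manifest.json"  # hypothetical file name
# Assumed config keys; some architectures name these differently.
TRACKED_KEYS = ("hidden_size", "num_hidden_layers", "num_attention_heads")


def write_manifest(adapter_dir: str, base_config: Dict[str, Any]) -> None:
    """Record the base model's key dimensions alongside the adapter."""
    record = {key: base_config.get(key) for key in TRACKED_KEYS}
    (Path(adapter_dir) / MANIFEST_NAME).write_text(json.dumps(record, indent=2))


def verify_manifest(adapter_dir: str, base_config: Dict[str, Any]) -> List[str]:
    """Return a message for each tracked dimension that differs."""
    record = json.loads((Path(adapter_dir) / MANIFEST_NAME).read_text())
    return [
        f"{key}: adapter trained with {record.get(key)}, "
        f"base model has {base_config.get(key)}"
        for key in TRACKED_KEYS
        if record.get(key) != base_config.get(key)
    ]
```

At training time `write_manifest` would be called with the base model's config as a dictionary (for Transformers models, `model.config.to_dict()`), and `verify_manifest` would be called with the config of the model about to receive the adapter.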

Python 3.9+ · peft >=0.4.0 · tested on 0.4.x
Verified 2026-04