How to load a LoRA adapter
Quick answer
To load a LoRA adapter, use the PEFT library's LoraConfig and get_peft_model functions with a pretrained Hugging Face model. Initialize your base model, configure the LoRA parameters, then wrap the model with get_peft_model to apply the adapter.
Prerequisites
- Python 3.8+
- pip install transformers peft torch
- A Hugging Face pretrained model checkpoint
Setup
Install the required libraries: transformers for model loading, peft for LoRA support, and torch for PyTorch backend.
pip install transformers peft torch
Step by step
This example loads a pretrained Hugging Face causal language model and applies a LoRA adapter with typical configuration parameters.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType
import torch

# Load base model and tokenizer
model_name = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

# Configure LoRA adapter
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type=TaskType.CAUSAL_LM,
)

# Apply the LoRA adapter to the base model
model = get_peft_model(model, lora_config)

# Model is now ready with LoRA adapter loaded
print("LoRA adapter loaded successfully")
```
Output
LoRA adapter loaded successfully
Common variations
- Use BitsAndBytesConfig to load the base model in 4-bit precision for QLoRA.
- Adjust target_modules depending on the model architecture (different models use different projection-layer names).
- For causal language models, set task_type=TaskType.CAUSAL_LM; for sequence classification, use TaskType.SEQ_CLS.
- LoRA adapters can be saved and loaded separately from the base model: save with model.save_pretrained() and reload with PeftModel.from_pretrained(base_model, peft_model_id).
Troubleshooting
- If you get CUDA out-of-memory errors, try loading the base model with 4-bit quantization using BitsAndBytesConfig(load_in_4bit=True).
- Ensure target_modules matches your model's layer names; otherwise LoRA won't apply correctly.
- Verify that your peft and transformers versions are compatible and up to date.
- If loading a saved LoRA adapter checkpoint, use PeftModel.from_pretrained(base_model, "path_or_hub_id") to load the weights.
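To find valid target_modules names, you can list the Linear layers of the loaded model with named_modules(). The sketch below runs the loop on a tiny stand-in module with Llama-style names (the Block class is illustrative); on a real checkpoint you would iterate over the AutoModelForCausalLM instance instead.

```python
import torch.nn as nn

# Stand-in module with Llama-style layer names (illustrative only)
class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.q_proj = nn.Linear(32, 32)
        self.k_proj = nn.Linear(32, 32)
        self.v_proj = nn.Linear(32, 32)
        self.mlp = nn.Linear(32, 32)

model = Block()

# Every nn.Linear name is a candidate for target_modules
linear_names = [name for name, mod in model.named_modules()
                if isinstance(mod, nn.Linear)]
print(linear_names)  # ['q_proj', 'k_proj', 'v_proj', 'mlp']
```

If a name you pass in target_modules never appears in this list, PEFT has nothing to wrap and the adapter silently does nothing useful.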
Key Takeaways
- Use PEFT's LoraConfig and get_peft_model to load LoRA adapters onto Hugging Face models.
- Configure target_modules and task_type correctly for your model architecture and task.
- Combine LoRA with 4-bit quantization (QLoRA) for efficient fine-tuning on limited hardware.
- Save and load LoRA adapters separately from base models for modular fine-tuning workflows.
- Keep the peft and transformers libraries updated to avoid compatibility issues.