How to · Intermediate · 3 min read

How to load a LoRA adapter

Quick answer
To attach a new LoRA adapter, use the PEFT library's LoraConfig and get_peft_model with a pretrained Hugging Face model: initialize the base model, configure the LoRA parameters, then wrap the model with get_peft_model. To load an already-trained adapter checkpoint, use PeftModel.from_pretrained instead.

PREREQUISITES

  • Python 3.8+
  • pip install transformers peft torch
  • Hugging Face pretrained model checkpoint

Setup

Install the required libraries: transformers for model loading, peft for LoRA support, and torch for PyTorch backend.

bash
pip install transformers peft torch

Step by step

This example loads a pretrained Hugging Face causal language model and applies a LoRA adapter with typical configuration parameters.

python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType
import torch

# Load base model and tokenizer
model_name = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="auto")

# Configure LoRA adapter
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type=TaskType.CAUSAL_LM
)

# Attach the LoRA adapter to the base model
model = get_peft_model(model, lora_config)

# Model is now wrapped with a freshly initialized LoRA adapter
print("LoRA adapter loaded successfully")
output
LoRA adapter loaded successfully
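To see why this is cheap, it helps to count the parameters LoRA adds: each targeted weight matrix of shape (d_in, d_out) gains a low-rank pair A (d_in × r) and B (r × d_out). The helper below is illustrative (not part of PEFT), and the 4096×4096 projection shapes are hypothetical stand-ins, not the exact dimensions of any particular checkpoint.

```python
def lora_param_count(r: int, shapes: list[tuple[int, int]]) -> int:
    """Extra trainable parameters LoRA adds: each targeted (d_in, d_out)
    weight gains low-rank factors A (d_in x r) and B (r x d_out)."""
    return sum(r * (d_in + d_out) for d_in, d_out in shapes)

# r=16 on two hypothetical 4096x4096 projections in a single layer
print(lora_param_count(16, [(4096, 4096), (4096, 4096)]))  # prints 262144
```

Even summed over every layer, the adapter is typically well under 1% of the base model's parameter count, which is why adapter checkpoints are only a few megabytes.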

Common variations

  • Use BitsAndBytesConfig to load the base model in 4-bit precision for QLoRA.
  • Adjust target_modules depending on model architecture (e.g., different projection layers).
  • For causal language models, set task_type=TaskType.CAUSAL_LM; for sequence classification, use TaskType.SEQ_CLS.
  • LoRA adapters can be saved separately from the base model with model.save_pretrained() and reattached later with PeftModel.from_pretrained(base_model, adapter_path_or_hub_id).

Troubleshooting

  • If you get CUDA out-of-memory errors, try loading the model with 4-bit quantization using BitsAndBytesConfig(load_in_4bit=True).
  • Ensure target_modules matches your model's layer names; otherwise, LoRA won't apply correctly.
  • Verify that peft and transformers versions are compatible and up to date.
  • If loading a LoRA adapter checkpoint, use PeftModel.from_pretrained(model, "path_or_hub_id") to load the weights; get_peft_model only creates a fresh, untrained adapter.

Key Takeaways

  • Use PEFT's LoraConfig and get_peft_model to attach LoRA adapters to Hugging Face models, and PeftModel.from_pretrained to load saved adapter checkpoints.
  • Configure target_modules and task_type correctly for your model architecture and task.
  • Combine LoRA with 4-bit quantization (QLoRA) for efficient fine-tuning on limited hardware.
  • Save and load LoRA adapters separately from base models for modular fine-tuning workflows.
  • Keep peft and transformers libraries updated to avoid compatibility issues.
Verified 2026-04 · meta-llama/Llama-3.1-8B-Instruct