How to train a LoRA adapter
Quick answer
To train a LoRA adapter, use the PEFT library with a pretrained base model from transformers. Load the model, configure LoRA parameters via LoraConfig, wrap the model with get_peft_model, then fine-tune using a trainer like transformers.Trainer on your dataset.

Prerequisites
- Python 3.8+
- pip install transformers peft datasets torch
- Access to a pretrained Hugging Face model (e.g., meta-llama/Llama-3.1-8B-Instruct)
- Basic knowledge of PyTorch and the Hugging Face Trainer
Setup
Install required packages and import necessary modules for LoRA training.
pip install transformers peft datasets torch

Step by step
This example shows how to train a LoRA adapter on a text dataset using Hugging Face transformers and peft. It loads a pretrained model, applies LoRA, prepares a dataset, and fine-tunes the adapter.
```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, TaskType
from datasets import load_dataset

# Load pretrained model and tokenizer
model_name = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(
    model_name, device_map="auto", torch_dtype=torch.float16
)

# Configure LoRA
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type=TaskType.CAUSAL_LM,
)

# Wrap model with LoRA
model = get_peft_model(model, lora_config)

# Load a small dataset for fine-tuning
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")

# Tokenize function
def tokenize_function(examples):
    return tokenizer(examples["text"], truncation=True, max_length=512)

tokenized_dataset = dataset.map(tokenize_function, batched=True, remove_columns=["text"])

# Causal-LM collator: pads each batch and copies input_ids into labels,
# which Trainer needs to compute a loss
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

# Training arguments
training_args = TrainingArguments(
    output_dir="./lora-llama",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    logging_steps=10,
    save_steps=10,
    save_total_limit=1,
    fp16=True,
)

# Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,
    data_collator=data_collator,
)

# Train LoRA adapter
trainer.train()

# Save LoRA adapter
model.save_pretrained("./lora-llama-adapter")
```

Output
```
***** Running training *****
  Num examples = 288
  Num Epochs = 1
  Instantaneous batch size per device = 4
  Total train batch size (w. parallel, distributed & accumulation) = 4
  Gradient Accumulation steps = 1
  Total optimization steps = 72
...
Training completed. Model saved to ./lora-llama-adapter
```
Common variations
- Use BitsAndBytesConfig to combine LoRA with 4-bit quantization (QLoRA) for memory efficiency.
- Use accelerate or PyTorch Lightning for distributed or multi-GPU setups.
- Apply LoRA to different base models by changing model_name and adjusting target_modules.
Troubleshooting
- If you get CUDA out-of-memory errors, reduce batch size or enable gradient checkpointing.
- Ensure target_modules matches the model architecture; otherwise, LoRA won't apply correctly.
- Check that your transformers, peft, and datasets versions are compatible.
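To find valid target_modules names for an unfamiliar architecture, you can list the suffixes of the Linear layers the model actually contains. A small helper (the name linear_target_names is ours, not a peft API):

```python
import torch.nn as nn

def linear_target_names(model):
    """Return the attribute-name suffixes of every nn.Linear submodule.

    LoraConfig's target_modules matches against these suffixes
    (e.g. "q_proj", "v_proj")."""
    return {
        name.rsplit(".", 1)[-1]
        for name, module in model.named_modules()
        if isinstance(module, nn.Linear)
    }

# Quick check on a toy module; on a real model, call
# linear_target_names(AutoModelForCausalLM.from_pretrained(model_name))
class _ToyAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.q_proj = nn.Linear(8, 8)
        self.v_proj = nn.Linear(8, 8)

names = linear_target_names(_ToyAttention())  # {'q_proj', 'v_proj'}
```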
Key Takeaways
- Use peft with LoraConfig and get_peft_model to efficiently fine-tune large models.
- Combine LoRA with 4-bit quantization (QLoRA) for memory-efficient training on limited hardware.
- Always match target_modules to your model's architecture to ensure LoRA layers are applied correctly.