How to · Intermediate · 4 min read

LoRA for domain adaptation

Quick answer
Use LoRA (Low-Rank Adaptation) to fine-tune large language models for a new domain efficiently: freeze the pretrained weights and train small low-rank matrices injected alongside them, drastically reducing the number of parameters to update. This enables fast, resource-light customization of LLMs without full retraining.
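
The core idea can be sketched in a few lines of NumPy (illustrative only; in practice the adapters live inside the model's attention layers): instead of updating a d×d weight matrix W, LoRA learns two small matrices B (d×r) and A (r×d) and adds their scaled product to the frozen W.

```python
import numpy as np

d, r = 768, 8            # hidden size (GPT-2) and LoRA rank
alpha = 16               # scaling factor; effective update is (alpha / r) * B @ A

W = np.random.randn(d, d)          # frozen pretrained weight
A = np.random.randn(r, d) * 0.01   # trainable, initialized small
B = np.zeros((d, r))               # trainable, initialized to zero

# Adapted weight: the base stays frozen, only A and B receive gradients.
# Because B starts at zero, the adapted model is identical to the base at step 0.
W_adapted = W + (alpha / r) * (B @ A)

full_params = W.size
lora_params = A.size + B.size
print(f"full fine-tune: {full_params:,} params, LoRA: {lora_params:,} "
      f"({100 * lora_params / full_params:.1f}%)")
# → full fine-tune: 589,824 params, LoRA: 12,288 (2.1%)
```

With r=8 the trainable parameters shrink to about 2% of the matrix they adapt, which is why LoRA fits on modest hardware.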

PREREQUISITES

  • Python 3.8+
  • pip install transformers>=4.30.0
  • pip install peft>=0.4.0
  • pip install datasets
  • Access to a pretrained LLM checkpoint (e.g., Hugging Face model)

Setup

Install the necessary Python packages for LoRA fine-tuning and dataset handling. Ensure you have a pretrained model checkpoint ready for adaptation.

bash
pip install transformers peft datasets

Step by step

This example shows how to apply LoRA for domain adaptation on a pretrained causal language model using the peft library. It fine-tunes only the low-rank adapters on a small domain-specific dataset.

python
import os
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

# Load pretrained model and tokenizer
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prepare LoRA configuration
lora_config = LoraConfig(
    r=8,  # rank of LoRA matrices
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT2 attention projection layers
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM"
)

# Wrap model with LoRA adapters
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # confirm only a small fraction is trainable

# Load a small domain-specific dataset (e.g., wikitext for demo)
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")

# Tokenize dataset
def tokenize_function(examples):
    return tokenizer(examples["text"], truncation=True, max_length=128)

tokenized_dataset = dataset.map(tokenize_function, batched=True)

# Training arguments
training_args = TrainingArguments(
    output_dir="./lora-domain-adapt",
    per_device_train_batch_size=8,
    num_train_epochs=3,
    logging_steps=10,
    save_steps=50,
    save_total_limit=2,
    learning_rate=3e-4,
    fp16=True,  # mixed precision; requires a CUDA GPU, set to False on CPU
    evaluation_strategy="no"
)

# Data collator builds the labels needed for the causal LM loss
from transformers import DataCollatorForLanguageModeling
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

# Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,
    data_collator=data_collator
)

# Train LoRA adapters
trainer.train()

# Save LoRA adapters only
model.save_pretrained("./lora-domain-adapt")

print("LoRA domain adaptation complete.")
output
***** Running training *****
  Num examples = 288
  Num Epochs = 3
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed) = 8
  Gradient Accumulation steps = 1
  Total optimization steps = 108
...
LoRA domain adaptation complete.
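
Once training finishes, the saved directory contains only the adapter weights. A minimal sketch of reloading them for inference (assuming the adapter directory from the run above exists):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Reload the frozen base model, then attach the saved LoRA adapters
base = AutoModelForCausalLM.from_pretrained("gpt2")
model = PeftModel.from_pretrained(base, "./lora-domain-adapt")

# Optional: fold adapters into the base weights for adapter-free inference
# model = model.merge_and_unload()

tokenizer = AutoTokenizer.from_pretrained("gpt2")
inputs = tokenizer("In this domain,", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```

Keeping base and adapters separate lets you serve several domain adaptations from one copy of the base model.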

Common variations

  • Use QLoRA by combining LoRA with 4-bit quantization via BitsAndBytesConfig for even lower resource usage.
  • Apply LoRA to other architectures by targeting their attention projections (e.g., q_proj and v_proj for LLaMA-style models; q and v for T5).
  • Use Trainer with DataCollatorForLanguageModeling for masked language modeling tasks.
  • Streamline training with mixed precision (fp16) and gradient checkpointing for large models.
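
As a sketch of the QLoRA variation (assuming the bitsandbytes package and a CUDA GPU are available), load the base model in 4-bit and then wrap it with the same LoraConfig as above:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization of the frozen base weights
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "gpt2", quantization_config=bnb_config, device_map="auto"
)
# Then apply get_peft_model(model, lora_config) as in the main example;
# only the (full-precision) LoRA adapters are trained.
```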

Troubleshooting

  • If training is slow or out of memory, reduce batch size or enable gradient checkpointing.
  • Ensure target_modules matches your model architecture; incorrect modules cause no adaptation.
  • Verify tokenizer padding and truncation settings to avoid input length errors.
  • Check that peft and transformers versions are compatible.
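
To find valid target_modules names for your architecture, you can list the linear-style submodules of the loaded model (a quick diagnostic sketch; for GPT-2 this typically surfaces c_attn, c_proj, and c_fc):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Collect the leaf names of Linear / Conv1D submodules -- these are the
# strings that LoraConfig's target_modules must match
candidates = {name.split(".")[-1]
              for name, module in model.named_modules()
              if module.__class__.__name__ in ("Linear", "Conv1D")}
print(sorted(candidates))
```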

Key Takeaways

  • LoRA enables efficient domain adaptation by updating a small subset of parameters.
  • Use peft library to easily integrate LoRA with Hugging Face models.
  • Target correct model modules for LoRA to be effective.
  • Combine LoRA with quantization (QLoRA) for resource-constrained fine-tuning.
  • Always save and load only LoRA adapters to keep base model intact.
Verified 2026-04 · gpt2, transformers, peft