How to save a LoRA adapter
Quick answer
To save a LoRA adapter after fine-tuning, use the save_pretrained() method on the PEFT model or the LoraModel instance. This stores the adapter weights and configuration to a directory for later reuse or deployment.
Prerequisites
- Python 3.8+
- pip install transformers peft torch
- Basic knowledge of Hugging Face Transformers and PEFT
Setup
Install the required Python packages transformers, peft, and torch to work with LoRA adapters and large language models.
pip install transformers peft torch
Step by step
This example shows how to load a base model, apply a LoRA adapter for fine-tuning, and then save the adapter weights and config to disk.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType
import torch
import os
# Load base model and tokenizer
model_name = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
# Configure LoRA adapter
lora_config = LoraConfig(
r=16,
lora_alpha=32,
target_modules=["q_proj", "v_proj"],
lora_dropout=0.05,
task_type=TaskType.CAUSAL_LM
)
# Apply LoRA to the model
model = get_peft_model(model, lora_config)
# (Assume fine-tuning happens here...)
# Directory to save LoRA adapter
save_dir = "./lora_adapter"
os.makedirs(save_dir, exist_ok=True)
# Save the LoRA adapter weights and config
model.save_pretrained(save_dir)
print(f"LoRA adapter saved to {save_dir}")
Output
LoRA adapter saved to ./lora_adapter
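After saving, you can sanity-check the output directory with the standard library. The file names below are an assumption based on typical PEFT output: recent versions write adapter_config.json and adapter_model.safetensors, while older versions save the weights as adapter_model.bin instead.

```python
import os

def missing_adapter_files(save_dir):
    """Return the set of expected adapter files missing from save_dir.

    Assumed file names: recent PEFT writes adapter_config.json and
    adapter_model.safetensors (older versions use adapter_model.bin).
    """
    expected = {"adapter_config.json", "adapter_model.safetensors"}
    if not os.path.isdir(save_dir):
        return expected
    return expected - set(os.listdir(save_dir))

missing = missing_adapter_files("./lora_adapter")
print("Missing adapter files:", missing or "none")
```

If the weights file is missing but adapter_model.bin is present, you are likely on an older PEFT version; the adapter is still valid.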
Common variations
You can save only the LoRA adapter weights, without the base model, by calling save_pretrained() on the PEFT model instance as shown. The saving method is the same regardless of the base model or training setup.
To load the saved adapter later, use from_pretrained() on the PEFT model with the saved directory.
from peft import PeftModel
from transformers import AutoModelForCausalLM
# Load the base model (same checkpoint used during fine-tuning)
model_name = "meta-llama/Llama-3.1-8B-Instruct"
base_model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
# Load LoRA adapter
lora_model = PeftModel.from_pretrained(base_model, "./lora_adapter")
# Use lora_model for inference or further fine-tuning
Troubleshooting
- If save_pretrained() raises an error, ensure the target directory exists and that you have write permissions.
- Verify that the model instance is a PEFT model; otherwise, LoRA-specific methods won't be available.
- Check that your environment has enough disk space for saving adapter weights.
Key takeaways
- Use model.save_pretrained() on the PEFT model to save the LoRA adapter weights and config.
- Saving only the adapter keeps your base model separate and lightweight for deployment.
- Load saved adapters with PeftModel.from_pretrained() combined with the base model.
- Ensure the save directory exists and has proper permissions before saving.
- The saving method is consistent across different base models and training setups.