How to save a LoRA adapter
Quick answer
To save a LoRA adapter after fine-tuning, use the save_pretrained() method on the PEFT model or the LoraModel instance. This stores the adapter weights and configuration to a directory for later reuse or deployment.
Prerequisites
- Python 3.8+
- pip install transformers peft torch
- Basic knowledge of Hugging Face Transformers and PEFT
Setup
Install the required Python packages transformers, peft, and torch to work with LoRA adapters and large language models.
pip install transformers peft torch
Step by step
This example shows how to load a base model, apply a LoRA adapter for fine-tuning, and then save the adapter weights and config to disk.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType
import torch
import os
# Load base model and tokenizer
model_name = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
# Configure LoRA adapter
lora_config = LoraConfig(
r=16,
lora_alpha=32,
target_modules=["q_proj", "v_proj"],
lora_dropout=0.05,
task_type=TaskType.CAUSAL_LM
)
# Apply LoRA to the model
model = get_peft_model(model, lora_config)
# (Assume fine-tuning happens here...)
# Directory to save LoRA adapter
save_dir = "./lora_adapter"
os.makedirs(save_dir, exist_ok=True)
# Save the LoRA adapter weights and config
model.save_pretrained(save_dir)
print(f"LoRA adapter saved to {save_dir}")
Output
LoRA adapter saved to ./lora_adapter
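After saving, you can sanity-check the output directory with the standard library. The file names below are an assumption based on typical PEFT output: recent versions write adapter_config.json and adapter_model.safetensors, while older versions save the weights as adapter_model.bin instead.

```python
import os

def missing_adapter_files(save_dir):
    """Return the set of expected adapter files missing from save_dir.

    Assumed file names: recent PEFT writes adapter_config.json and
    adapter_model.safetensors (older versions use adapter_model.bin).
    """
    expected = {"adapter_config.json", "adapter_model.safetensors"}
    if not os.path.isdir(save_dir):
        return expected
    return expected - set(os.listdir(save_dir))

missing = missing_adapter_files("./lora_adapter")
print("Missing adapter files:", missing or "none")
```

If the weights file is missing but adapter_model.bin is present, you are likely on an older PEFT version; the adapter is still valid.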
Common variations
You can save only the LoRA adapter weights, without the base model, by calling save_pretrained() on the PEFT model instance as shown. The saving method is the same regardless of the base model or training setup.
To load the saved adapter later, use from_pretrained() on the PEFT model with the saved directory.
from peft import PeftModel
from transformers import AutoModelForCausalLM
# Load the base model (same checkpoint used during fine-tuning)
model_name = "meta-llama/Llama-3.1-8B-Instruct"
base_model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
# Load LoRA adapter
lora_model = PeftModel.from_pretrained(base_model, "./lora_adapter")
# Use lora_model for inference or further fine-tuning
Troubleshooting
- If save_pretrained() raises an error, ensure the target directory exists and that you have write permissions.
- Verify that the model instance is a PEFT model; otherwise, LoRA-specific methods won't be available.
- Check that your environment has enough disk space for saving adapter weights.
Key takeaways
- Use model.save_pretrained() on the PEFT model to save the LoRA adapter weights and config.
- Saving only the adapter keeps your base model separate and lightweight for deployment.
- Load saved adapters with PeftModel.from_pretrained() combined with the base model.
- Ensure the save directory exists and has proper permissions before saving.
- The saving method is consistent across different base models and training setups.