How to · Intermediate · 3 min read

How to use LoRA with Stable Diffusion

Quick answer
Use LoRA (Low-Rank Adaptation) to adapt Stable Diffusion models with lightweight low-rank adapters instead of full fine-tuning: load a base Stable Diffusion model, apply LoRA weights via libraries like diffusers or peft, and run inference with the adapted model.
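The idea behind LoRA can be sketched in a few lines of plain PyTorch: instead of updating a full weight matrix W, it learns a low-rank product B·A that is added on top. The dimensions below are illustrative, not taken from Stable Diffusion's actual layers.

```python
import torch

torch.manual_seed(0)

d_out, d_in, r = 320, 768, 4   # illustrative sizes; r is the LoRA rank
alpha = 8                      # LoRA scaling hyperparameter

W = torch.randn(d_out, d_in)   # frozen base weight
A = torch.randn(r, d_in)       # trainable down-projection
B = torch.zeros(d_out, r)      # trainable up-projection (zero-initialized)

# Adapted weight: base plus scaled low-rank update
W_adapted = W + (alpha / r) * (B @ A)

# With B zero-initialized, the adapter starts as a no-op
assert torch.allclose(W_adapted, W)

# The adapter trains far fewer parameters than the full matrix
full_params = W.numel()
lora_params = A.numel() + B.numel()
print(f"full: {full_params}, LoRA: {lora_params}")  # full: 245760, LoRA: 4352
```

This is why LoRA checkpoints are small and cheap to train: here the adapter holds under 2% of the parameters of the matrix it modifies.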

PREREQUISITES

  • Python 3.8+
  • pip install torch diffusers transformers accelerate peft safetensors
  • Access to a pre-trained Stable Diffusion model
  • Basic knowledge of PyTorch and model fine-tuning

Setup

Install the necessary Python packages to work with Stable Diffusion and LoRA adapters. Use diffusers for Stable Diffusion and peft for LoRA integration. Ensure you have a compatible GPU for efficient inference.

bash
pip install torch diffusers transformers accelerate peft safetensors

Step by step

This example demonstrates loading a Stable Diffusion base model and applying LoRA weights for inference using the diffusers and peft libraries.

python
import torch
from diffusers import StableDiffusionPipeline
from peft import PeftModel

# Load base Stable Diffusion model
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")

# Load LoRA weights (replace with your LoRA checkpoint path or repo).
# PeftModel.from_pretrained expects a peft-format adapter directory
# (one containing an adapter_config.json); for a single-file .safetensors
# LoRA, use pipe.load_lora_weights(lora_path) instead.
lora_path = "path/to/lora_weights"

# Wrap the UNet model with LoRA
pipe.unet = PeftModel.from_pretrained(pipe.unet, lora_path)

# Run inference with LoRA-adapted model
prompt = "A fantasy landscape, vivid colors"
image = pipe(prompt, guidance_scale=7.5).images[0]

# Save or display the image
image.save("output_lora.png")
print("Image generated and saved as output_lora.png")
output
Image generated and saved as output_lora.png

Common variations

  • Use accelerate for multi-GPU or mixed precision setups to speed up inference.
  • Apply LoRA to other components like the text encoder if supported.
  • Use different LoRA repositories or fine-tune your own LoRA adapters with peft.
  • Run inference asynchronously or in batch mode depending on your application.
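The "fine-tune your own adapters" variation boils down to wrapping a frozen layer with a trainable low-rank branch. Below is a hand-rolled sketch of what peft's LoRA layers do internally; the class name LoRALinear and the layer sizes are illustrative, not part of any library's API.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank branch."""

    def __init__(self, base: nn.Linear, r: int = 4, alpha: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():       # freeze the original weights
            p.requires_grad_(False)
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        # Base output plus the scaled low-rank update
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(nn.Linear(768, 320))
x = torch.randn(2, 768)

# Zero-initialized lora_B means the adapter starts as a no-op
assert torch.allclose(layer(x), layer.base(x))

# Only the adapter parameters receive gradients
trainable = [n for n, p in layer.named_parameters() if p.requires_grad]
print(trainable)  # ['lora_A', 'lora_B']
```

In practice you would let peft's LoraConfig inject layers like this into the UNet's attention projections rather than writing them by hand, but the training setup is the same: optimize only the adapter parameters and leave the base model frozen.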

Troubleshooting

  • If you get CUDA out-of-memory errors, lower the image resolution or batch size, or load the model in half precision (torch_dtype=torch.float16).
  • Ensure LoRA weights are compatible with the base model architecture.
  • Check that peft and diffusers versions are up to date to avoid API mismatches.
  • If images are not generated correctly, verify the prompt and guidance scale parameters.
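One way to catch the compatibility problem above before it surfaces as a cryptic runtime error is to compare tensor shapes between the checkpoint and the model. check_shapes below is a hypothetical helper, not part of diffusers or peft, and the toy state dicts stand in for pipe.unet.state_dict() and a loaded LoRA file.

```python
import torch

def check_shapes(model_sd, lora_sd):
    """Return keys whose shapes disagree between a model and a LoRA checkpoint."""
    mismatches = []
    for key, tensor in lora_sd.items():
        if key in model_sd and model_sd[key].shape != tensor.shape:
            mismatches.append((key, tuple(model_sd[key].shape), tuple(tensor.shape)))
    return mismatches

# Toy state dicts: the LoRA below was trained against a different base model
model_sd = {"attn.to_q.weight": torch.zeros(320, 320)}
lora_sd = {"attn.to_q.weight": torch.zeros(320, 768)}

print(check_shapes(model_sd, lora_sd))
# [('attn.to_q.weight', (320, 320), (320, 768))]
```

An empty result does not guarantee the LoRA matches your base model version, but a non-empty one tells you immediately which layers disagree.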

Key Takeaways

  • Use peft to apply LoRA adapters on Stable Diffusion's UNet model for efficient fine-tuning.
  • Install and use diffusers with torch and peft for seamless LoRA integration.
  • Adjust guidance_scale and prompt to control output quality during inference.
  • Ensure LoRA weights match the base model version to avoid compatibility issues.
  • Use mixed precision and GPU acceleration to optimize performance and memory usage.
Verified 2026-04 · runwayml/stable-diffusion-v1-5