How to use Hugging Face Diffusers in Python
Direct answer
Use the diffusers Python library to load a Stable Diffusion pipeline and generate images by passing a text prompt to the pipeline's call method.
Setup
Install
pip install diffusers transformers torch
Imports
from diffusers import StableDiffusionPipeline
import torch
Examples
in: A futuristic cityscape at sunset
out: Generates a high-quality image depicting a futuristic cityscape with warm sunset colors.
in: A fantasy dragon flying over mountains
out: Produces an image of a detailed dragon soaring above rugged mountain peaks.
in: An astronaut riding a horse on Mars
out: Creates a surreal image showing an astronaut on horseback on the Martian surface.
Integration steps
- Install the diffusers, transformers, and torch packages.
- Import StableDiffusionPipeline from diffusers and torch.
- Load the Stable Diffusion pipeline with a pretrained model and set the device to CUDA if available.
- Call the pipeline with a text prompt to generate an image tensor.
- Save or display the generated image.
Full code
from diffusers import StableDiffusionPipeline
import torch
# Load the Stable Diffusion pipeline
model_id = "runwayml/stable-diffusion-v1-5"
pipeline = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# Use GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
pipeline = pipeline.to(device)
# Generate an image from a prompt
prompt = "A futuristic cityscape at sunset"
image = pipeline(prompt).images[0]
# Save the image
image.save("output.png")
print("Image saved as output.png")
Output
Image saved as output.png
API trace
Request
{"model_id": "runwayml/stable-diffusion-v1-5", "prompt": "A futuristic cityscape at sunset", "dtype": "float16", "device": "cuda"}
Response
{"images": [<PIL.Image.Image object>], "nsfw_content_detected": [false]}
Extract
image = pipeline(prompt).images[0]
Variants
Show per-step generation progress ›
Use when you want to monitor generation progress in the console for longer prompts or slower hardware.
from diffusers import StableDiffusionPipeline
import torch
model_id = "runwayml/stable-diffusion-v1-5"
pipeline = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
device = "cuda" if torch.cuda.is_available() else "cpu"
pipeline = pipeline.to(device)
# The pipeline shows a per-step progress bar by default; make that explicit
pipeline.set_progress_bar_config(disable=False)
image = pipeline("A fantasy dragon flying over mountains").images[0]
image.save("dragon.png")
print("Image saved as dragon.png")
Async image generation ›
Use in asynchronous Python applications to avoid blocking the event loop during image generation.
import asyncio
from diffusers import StableDiffusionPipeline
import torch
async def generate_image():
    model_id = "runwayml/stable-diffusion-v1-5"
    pipeline = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    pipeline = pipeline.to(device)
    # Run the blocking pipeline call in a worker thread so the event loop stays free
    result = await asyncio.to_thread(pipeline, "An astronaut riding a horse on Mars")
    result.images[0].save("mars_horse.png")
    print("Image saved as mars_horse.png")
asyncio.run(generate_image())
Use a smaller model for faster inference ›
Use when you need faster generation with slightly lower image quality or fewer resources.
from diffusers import StableDiffusionPipeline
import torch
# Note: v1-4 (CompVis/stable-diffusion-v1-4) is the same size as v1-5; for a
# genuinely smaller, faster checkpoint use a distilled model such as segmind/tiny-sd
model_id = "segmind/tiny-sd"
pipeline = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
device = "cuda" if torch.cuda.is_available() else "cpu"
pipeline = pipeline.to(device)
prompt = "A serene lake in the mountains"
image = pipeline(prompt).images[0]
image.save("lake.png")
print("Image saved as lake.png")
Performance
Latency: ~5-15 seconds per 512x512 image on a modern GPU
Cost: Free for local use; cloud GPU costs vary by provider
Rate limits: No API rate limits for local use; cloud APIs have provider-specific limits
- Keep prompts concise to reduce generation time.
- Use half-precision (float16) to speed up inference and reduce memory.
- Batch multiple prompts if supported to improve throughput.
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Local GPU with diffusers | ~5-15s | Free (hardware cost only) | Full control, no API limits |
| Cloud API (e.g. Hugging Face Inference) | ~2-5s | Paid per image | Quick setup, no hardware needed |
| Smaller models | ~2-7s | Free or cheaper | Faster generation, lower quality |
Quick tip
Move the pipeline to the GPU with pipeline.to('cuda') whenever one is available; calling it without a GPU raises an error, so guard it with torch.cuda.is_available().
Common mistake
Not setting the pipeline device to CUDA when a GPU is available, resulting in slow CPU inference.