How to use Hugging Face Diffusers in Python
Direct answer
Use the diffusers Python library to load a Stable Diffusion pipeline and generate images by passing a text prompt to the pipeline's call method.
Setup
Install
pip install diffusers transformers torch
Imports
from diffusers import StableDiffusionPipeline
import torch
Examples
in: A futuristic cityscape at sunset
out: Generates a high-quality image depicting a futuristic cityscape with warm sunset colors.
in: A fantasy dragon flying over mountains
out: Produces an image of a detailed dragon soaring above rugged mountain peaks.
in: An astronaut riding a horse on Mars
out: Creates a surreal image showing an astronaut on horseback on the Martian surface.
Integration steps
- Install the diffusers, transformers, and torch packages.
- Import StableDiffusionPipeline from diffusers and torch.
- Load the Stable Diffusion pipeline with a pretrained model and set the device to CUDA if available.
- Call the pipeline with a text prompt to generate an image tensor.
- Save or display the generated image.
Full code
from diffusers import StableDiffusionPipeline
import torch
# Load the Stable Diffusion pipeline
model_id = "runwayml/stable-diffusion-v1-5"
pipeline = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# Use GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
pipeline = pipeline.to(device)
# Generate an image from a prompt
prompt = "A futuristic cityscape at sunset"
image = pipeline(prompt).images[0]
# Save the image
image.save("output.png")
print("Image saved as output.png")
Output
Image saved as output.png
API trace
Request
{"model_id": "runwayml/stable-diffusion-v1-5", "prompt": "A futuristic cityscape at sunset", "dtype": "float16", "device": "cuda"}
Response
{"images": [<PIL.Image.Image object>], "nsfw_content_detected": [false]}
Extract
image = pipeline(prompt).images[0]
Variants
Show per-step generation progress ›
Use when you want to monitor generation progress in the console for longer prompts or slower hardware.
from diffusers import StableDiffusionPipeline
import torch
model_id = "runwayml/stable-diffusion-v1-5"
pipeline = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
device = "cuda" if torch.cuda.is_available() else "cpu"
pipeline = pipeline.to(device)
# The pipeline shows a per-step progress bar by default; make that explicit
pipeline.set_progress_bar_config(disable=False)
image = pipeline("A fantasy dragon flying over mountains").images[0]
image.save("dragon.png")
print("Image saved as dragon.png")
Async image generation ›
Use in asynchronous Python applications to avoid blocking the event loop during image generation.
import asyncio
from diffusers import StableDiffusionPipeline
import torch
async def generate_image():
    model_id = "runwayml/stable-diffusion-v1-5"
    pipeline = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    pipeline = pipeline.to(device)
    # Run the blocking pipeline call in a worker thread so the event loop stays free
    result = await asyncio.to_thread(pipeline, "An astronaut riding a horse on Mars")
    result.images[0].save("mars_horse.png")
    print("Image saved as mars_horse.png")
asyncio.run(generate_image())
Use a smaller model for faster inference ›
Use when you need faster generation with slightly lower image quality or fewer resources.
from diffusers import StableDiffusionPipeline
import torch
# Note: v1-4 (CompVis/stable-diffusion-v1-4) is the same size as v1-5; for a
# genuinely smaller, faster checkpoint use a distilled model such as segmind/tiny-sd
model_id = "segmind/tiny-sd"
pipeline = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
device = "cuda" if torch.cuda.is_available() else "cpu"
pipeline = pipeline.to(device)
prompt = "A serene lake in the mountains"
image = pipeline(prompt).images[0]
image.save("lake.png")
print("Image saved as lake.png")
Performance
Latency: ~5-15 seconds per 512x512 image on a modern GPU
Cost: Free for local use; cloud GPU costs vary by provider
Rate limits: No API rate limits for local use; cloud APIs have provider-specific limits
- Keep prompts concise to reduce generation time.
- Use half-precision (float16) to speed up inference and reduce memory.
- Batch multiple prompts if supported to improve throughput.
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Local GPU with diffusers | ~5-15s | Free (hardware cost only) | Full control, no API limits |
| Cloud API (e.g. Hugging Face Inference) | ~2-5s | Paid per image | Quick setup, no hardware needed |
| Smaller models | ~2-7s | Free or cheaper | Faster generation, lower quality |
Quick tip
Move the pipeline to the GPU with pipeline.to('cuda') whenever one is available; calling it without a GPU raises an error, so guard it with torch.cuda.is_available().
Common mistake
Not setting the pipeline device to CUDA when a GPU is available, resulting in slow CPU inference.