How to build an image generation app with Python
Direct answer

Use the diffusers library with StableDiffusionPipeline in Python to build an image generation app that takes text prompts as input and outputs images.

Setup
Install

```shell
pip install "diffusers[torch]" transformers accelerate scipy safetensors
```

Env vars

HUGGINGFACE_TOKEN

Imports

```python
from diffusers import StableDiffusionPipeline
import torch
import os
```

Examples
in: A futuristic cityscape at sunset
out: Generates a high-resolution image depicting a futuristic cityscape with warm sunset colors.
in: A cute cat wearing a wizard hat
out: Produces an image of an adorable cat dressed as a wizard with a pointed hat and magical background.
in: An astronaut riding a horse on Mars
out: Creates a surreal image of an astronaut on horseback on the red Martian surface.
Integration steps
- Install the required Python packages including diffusers and torch.
- Set the Hugging Face API token in the HUGGINGFACE_TOKEN environment variable.
- Import StableDiffusionPipeline and initialize it with the model and token.
- Use the pipeline to generate images by passing text prompts.
- Save or display the generated images in your app interface.
Full code

```python
import os

import torch
from diffusers import StableDiffusionPipeline

hf_token = os.environ.get("HUGGINGFACE_TOKEN")
if not hf_token:
    raise ValueError("Set the HUGGINGFACE_TOKEN environment variable.")

# float16 speeds up GPU inference; fall back to full precision on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Load the pipeline once at startup; pass the token so gated models authenticate.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=dtype,
    token=hf_token,
)
pipe = pipe.to(device)

def generate_image(prompt: str, output_path: str = "output.png"):
    image = pipe(prompt).images[0]
    image.save(output_path)
    print(f"Image saved to {output_path}")

generate_image("A futuristic cityscape at sunset")
```

output

```
Image saved to output.png
```
API trace

Request

```json
{"model_id": "runwayml/stable-diffusion-v1-5", "prompt": "A futuristic cityscape at sunset"}
```

Response

```json
{"images": ["<PIL.Image object>"]}
```

Extract

```python
pipe(prompt).images[0]
```

Variants
Generation with progress callback ›

Use when you want to show progress to users during long image generation.

```python
import os

import torch
from diffusers import StableDiffusionPipeline

hf_token = os.environ.get("HUGGINGFACE_TOKEN")

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    token=hf_token,
)
pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

def progress_callback(step: int, timestep: int, latents):
    print(f"Step {step} at timestep {timestep}")

# Note: newer diffusers releases deprecate `callback` in favor of `callback_on_step_end`.
image = pipe("A cute cat wearing a wizard hat", callback=progress_callback).images[0]
image.save("cat_wizard.png")
```

Async generation with asyncio ›
Use in async Python applications to avoid blocking the event loop.

```python
import asyncio
import os

import torch
from diffusers import StableDiffusionPipeline

async def async_generate():
    hf_token = os.environ.get("HUGGINGFACE_TOKEN")
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
        token=hf_token,
    )
    pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")
    # Run the blocking pipeline call in a worker thread (Python 3.9+).
    result = await asyncio.to_thread(pipe, "An astronaut riding a horse on Mars")
    result.images[0].save("astronaut_mars.png")

asyncio.run(async_generate())
```

Use SD 2 base for faster generation ›
Use SD 2 base for faster inference with slightly lower fidelity.

```python
import os

import torch
from diffusers import StableDiffusionPipeline

hf_token = os.environ.get("HUGGINGFACE_TOKEN")

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-base",
    torch_dtype=torch.float16,
    token=hf_token,
)
pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

image = pipe("A beautiful mountain landscape").images[0]
image.save("mountain.png")
```

Performance
Latency: ~5-15 seconds per 512x512 image on a modern GPU
Cost: Free for local runs; cloud GPU costs vary by provider
Rate limits: None when running locally
- Cache the pipeline object to avoid repeated model loading.
- Use torch_dtype=torch.float16 for faster GPU inference.
- Use lower resolution or smaller models for faster results.
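The caching tip above can be sketched with functools.lru_cache. Here get_pipeline is an illustrative name and a plain dict stands in for the pipeline so the pattern runs without a GPU; in a real app the function body would contain the from_pretrained / .to(device) setup shown earlier.

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def get_pipeline():
    # In a real app this body would build and return the
    # StableDiffusionPipeline; a plain dict stands in here.
    print("loading model...")
    return {"model": "runwayml/stable-diffusion-v1-5"}

first = get_pipeline()   # prints "loading model..." once
second = get_pipeline()  # cache hit: no reload
print(first is second)   # → True
```

Because maxsize=1 keeps exactly one cached result, every call after the first returns the same already-loaded object.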
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Local GPU with diffusers | ~5-15s | Free (hardware cost) | Full control, no API limits |
| Hugging Face Inference API | ~3-10s | Paid per call | Quick setup, no local GPU needed |
| Lightweight models (SD 2 base) | ~2-5s | Free or cheaper | Faster generation, lower quality |
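The hosted-API row in the table could look roughly like the sketch below. The endpoint follows Hugging Face's Inference API convention (https://api-inference.huggingface.co/models/&lt;model-id&gt; with a Bearer token and an "inputs" payload), and build_request / generate_remote are illustrative names, not part of any library.

```python
import os

API_URL = "https://api-inference.huggingface.co/models/runwayml/stable-diffusion-v1-5"

def build_request(prompt: str, token: str):
    """Assemble the URL, auth header, and JSON payload for one call."""
    headers = {"Authorization": f"Bearer {token}"}
    payload = {"inputs": prompt}
    return API_URL, headers, payload

def generate_remote(prompt: str, output_path: str = "remote.png"):
    import requests  # third-party; pip install requests
    url, headers, payload = build_request(prompt, os.environ["HUGGINGFACE_TOKEN"])
    resp = requests.post(url, headers=headers, json=payload, timeout=120)
    resp.raise_for_status()
    # The API responds with raw image bytes for text-to-image models.
    with open(output_path, "wb") as f:
        f.write(resp.content)
```

This trades local GPU requirements for per-call latency and cost, matching the middle row of the comparison table.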
Quick tip
Cache the pipeline object outside your generation function to avoid reloading the model on every call.
Common mistake
Forgetting to set HUGGINGFACE_TOKEN, or skipping the pipe.to() call before generating, causes authentication or device errors.
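A small fail-fast guard for both mistakes can look like this; resolve_config and SD_DEVICE are illustrative names, and the device choice is sketched via an environment variable so the snippet runs even where torch is not installed (with torch, the commented line applies instead).

```python
import os

def resolve_config():
    """Fail fast on the two setup mistakes: missing token, unset device."""
    token = os.environ.get("HUGGINGFACE_TOKEN")
    if not token:
        raise ValueError("Set the HUGGINGFACE_TOKEN environment variable.")
    # With torch installed this line would be:
    #   device = "cuda" if torch.cuda.is_available() else "cpu"
    device = os.environ.get("SD_DEVICE", "cpu")
    return token, device
```

Calling this once at startup surfaces configuration problems before the first slow generation call, instead of midway through a request.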