How to run Stable Diffusion with Modal
Quick answer
Use the modal Python package to run Stable Diffusion in a serverless GPU environment by defining a GPU-enabled function on a modal.App and running model inference inside it. Install modal and diffusers, then run or deploy your function to generate images from prompts.
Prerequisites
- Python 3.8+
- pip install modal diffusers transformers torch torchvision accelerate safetensors
- A Modal account with the CLI authenticated (GPUs are provisioned by Modal, so no local NVIDIA GPU is required)
Setup
Install the required Python packages and configure your Modal environment. You need modal for serverless GPU execution, plus diffusers and its dependencies for Stable Diffusion. If you haven't authenticated the Modal CLI yet, run modal setup first.
- Install packages:

```shell
pip install modal diffusers transformers torch torchvision accelerate safetensors
```

Output:

```
Collecting modal
Collecting diffusers
Collecting transformers
Collecting torch
Collecting torchvision
Collecting accelerate
Collecting safetensors
Successfully installed modal diffusers transformers torch torchvision accelerate safetensors
```
Step by step
Create a modal.App and define a GPU-enabled function that loads the Stable Diffusion pipeline and generates an image from a prompt. Save the script (for example as stable_diffusion_modal.py), then execute it with modal run stable_diffusion_modal.py, or deploy it with modal deploy for persistent invocation.
```python
import io

import modal

app = modal.App("stable-diffusion-example")

# Heavy dependencies are installed into the remote container image,
# so they are imported inside the function rather than at module top level.
sd_image = modal.Image.debian_slim().pip_install(
    "torch", "diffusers", "transformers", "safetensors", "accelerate"
)

@app.function(gpu="A10G", image=sd_image)
def run_stable_diffusion(prompt: str) -> bytes:
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    ).to("cuda")
    image = pipe(prompt).images[0]
    buf = io.BytesIO()
    image.save(buf, format="PNG")  # encode as PNG, not raw pixel bytes
    return buf.getvalue()

@app.local_entrypoint()
def main():
    # Runs when you execute: modal run stable_diffusion_modal.py
    image_bytes = run_stable_diffusion.remote("A futuristic cityscape at sunset")
    with open("output.png", "wb") as f:
        f.write(image_bytes)
    print("Image saved to output.png")
```

Output:

```
Image saved to output.png
```
Common variations
Modal also supports async functions: declare the function with async def and await the result via .remote.aio(...). You can likewise customize the model version or GPU type by changing the @app.function decorator parameters.
```python
import io

import modal

app = modal.App("stable-diffusion-async")

sd_image = modal.Image.debian_slim().pip_install(
    "torch", "diffusers", "transformers", "safetensors", "accelerate"
)

@app.function(gpu="A100", image=sd_image)
async def run_stable_diffusion_async(prompt: str) -> bytes:
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    ).to("cuda")
    image = pipe(prompt).images[0]
    buf = io.BytesIO()
    image.save(buf, format="PNG")
    return buf.getvalue()

@app.local_entrypoint()
async def main():
    # An async local entrypoint lets you await the remote call directly.
    image_bytes = await run_stable_diffusion_async.remote.aio("A serene forest in autumn")
    with open("output_async.png", "wb") as f:
        f.write(image_bytes)
    print("Async image saved to output_async.png")
```

Output:

```
Async image saved to output_async.png
```
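The async variant relies on ordinary Python asyncio. Outside Modal, the same await pattern can be sketched with a stdlib-only stand-in coroutine in place of the remote Modal call (fake_generate here is purely illustrative and does no real image generation):

```python
import asyncio

async def fake_generate(prompt: str) -> bytes:
    # Stand-in for an awaitable remote call such as
    # run_stable_diffusion_async.remote.aio(prompt).
    await asyncio.sleep(0)
    return f"image for: {prompt}".encode()

async def main() -> bytes:
    # Await the coroutine just as you would the Modal call.
    return await fake_generate("A serene forest in autumn")

result = asyncio.run(main())
print(result)  # b'image for: A serene forest in autumn'
```

The only Modal-specific part is which awaitable you hand to await; the surrounding asyncio plumbing is unchanged.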
Troubleshooting
- If you see CUDA out of memory errors, reduce the image resolution or batch size, or switch to a GPU with more memory (e.g. gpu="A100").
- Ensure your Modal CLI is authenticated by running modal setup (or modal token new).
- For dependency issues, verify that every required package appears in the modal.Image pip_install list; the remote container only contains what you install there.
- If the image output is corrupted, encode the PIL image in a standard format such as PNG before returning it, rather than returning raw pixel data from tobytes().
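As a concrete fix for the corrupted-output case, here is a minimal sketch of encoding a PIL image as PNG bytes. It assumes Pillow is available (it is installed as a dependency of diffusers), and the helper name image_to_png_bytes is our own:

```python
import io

from PIL import Image  # Pillow, pulled in as a diffusers dependency

def image_to_png_bytes(img: Image.Image) -> bytes:
    """Encode a PIL image as PNG, so the returned bytes carry a decodable
    header, unlike img.tobytes(), which yields headerless raw pixel data."""
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return buf.getvalue()

# Quick check with a dummy image standing in for pipe(prompt).images[0]:
demo = Image.new("RGB", (8, 8), color=(255, 0, 0))
png = image_to_png_bytes(demo)
print(png[:8])  # PNG files begin with the signature b'\x89PNG\r\n\x1a\n'
```

Returning PNG-encoded bytes also means the caller can write them straight to an .png file that any image viewer can open.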
Key Takeaways
- Use @app.function with gpu="A10G" to run Stable Diffusion on a GPU in Modal.
- Install all required dependencies inside the Modal image with pip_install.
- Run your app with modal run (or deploy it with modal deploy) and call the function to generate images serverlessly.
- Async functions and different GPU types are supported by adjusting the decorator and the function definition.
- Troubleshoot CUDA memory and dependency issues by adjusting GPU resources and the image's installed packages.