How to run Stable Diffusion with Modal
Quick answer
Use the modal Python package to run Stable Diffusion in a serverless GPU environment by defining a GPU-enabled function on a modal.App and running model inference inside it. Install modal and diffusers, then run or deploy your function to generate images from prompts.
Prerequisites
- Python 3.8+
- pip install modal diffusers transformers torch torchvision accelerate safetensors
- A Modal account with the CLI authenticated (GPUs are provisioned by Modal, so no local NVIDIA GPU is required)
Setup
Install the required Python packages and configure your Modal environment. You need modal for serverless GPU execution, plus diffusers and its dependencies for Stable Diffusion. If you haven't authenticated the Modal CLI yet, run modal setup first.
- Install packages:

```shell
pip install modal diffusers transformers torch torchvision accelerate safetensors
```

Output:

```
Collecting modal
Collecting diffusers
Collecting transformers
Collecting torch
Collecting torchvision
Collecting accelerate
Collecting safetensors
Successfully installed modal diffusers transformers torch torchvision accelerate safetensors
```
Step by step
Create a modal.App and define a GPU-enabled function that loads the Stable Diffusion pipeline and generates an image from a prompt. Save the script (for example as stable_diffusion_modal.py), then execute it with modal run stable_diffusion_modal.py, or deploy it with modal deploy for persistent invocation.
```python
import io

import modal

app = modal.App("stable-diffusion-example")

# Heavy dependencies are installed into the remote container image,
# so they are imported inside the function rather than at module top level.
sd_image = modal.Image.debian_slim().pip_install(
    "torch", "diffusers", "transformers", "safetensors", "accelerate"
)

@app.function(gpu="A10G", image=sd_image)
def run_stable_diffusion(prompt: str) -> bytes:
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    ).to("cuda")
    image = pipe(prompt).images[0]
    buf = io.BytesIO()
    image.save(buf, format="PNG")  # encode as PNG, not raw pixel bytes
    return buf.getvalue()

@app.local_entrypoint()
def main():
    # Runs when you execute: modal run stable_diffusion_modal.py
    image_bytes = run_stable_diffusion.remote("A futuristic cityscape at sunset")
    with open("output.png", "wb") as f:
        f.write(image_bytes)
    print("Image saved to output.png")
```

Output:

```
Image saved to output.png
```
Common variations
Modal also supports async functions: declare the function with async def and await the result via .remote.aio(...). You can likewise customize the model version or GPU type by changing the @app.function decorator parameters.
```python
import io

import modal

app = modal.App("stable-diffusion-async")

sd_image = modal.Image.debian_slim().pip_install(
    "torch", "diffusers", "transformers", "safetensors", "accelerate"
)

@app.function(gpu="A100", image=sd_image)
async def run_stable_diffusion_async(prompt: str) -> bytes:
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    ).to("cuda")
    image = pipe(prompt).images[0]
    buf = io.BytesIO()
    image.save(buf, format="PNG")
    return buf.getvalue()

@app.local_entrypoint()
async def main():
    # An async local entrypoint lets you await the remote call directly.
    image_bytes = await run_stable_diffusion_async.remote.aio("A serene forest in autumn")
    with open("output_async.png", "wb") as f:
        f.write(image_bytes)
    print("Async image saved to output_async.png")
```

Output:

```
Async image saved to output_async.png
```
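The async variant relies on ordinary Python asyncio. Outside Modal, the same await pattern can be sketched with a stdlib-only stand-in coroutine in place of the remote Modal call (fake_generate here is purely illustrative and does no real image generation):

```python
import asyncio

async def fake_generate(prompt: str) -> bytes:
    # Stand-in for an awaitable remote call such as
    # run_stable_diffusion_async.remote.aio(prompt).
    await asyncio.sleep(0)
    return f"image for: {prompt}".encode()

async def main() -> bytes:
    # Await the coroutine just as you would the Modal call.
    return await fake_generate("A serene forest in autumn")

result = asyncio.run(main())
print(result)  # b'image for: A serene forest in autumn'
```

The only Modal-specific part is which awaitable you hand to await; the surrounding asyncio plumbing is unchanged.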
Troubleshooting
- If you see CUDA out of memory errors, reduce the image resolution or batch size, or switch to a GPU with more memory (e.g. gpu="A100").
- Ensure your Modal CLI is authenticated by running modal setup (or modal token new).
- For dependency issues, verify that every required package appears in the modal.Image pip_install list; the remote container only contains what you install there.
- If the image output is corrupted, encode the PIL image in a standard format such as PNG before returning it, rather than returning raw pixel data from tobytes().
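As a concrete fix for the corrupted-output case, here is a minimal sketch of encoding a PIL image as PNG bytes. It assumes Pillow is available (it is installed as a dependency of diffusers), and the helper name image_to_png_bytes is our own:

```python
import io

from PIL import Image  # Pillow, pulled in as a diffusers dependency

def image_to_png_bytes(img: Image.Image) -> bytes:
    """Encode a PIL image as PNG, so the returned bytes carry a decodable
    header, unlike img.tobytes(), which yields headerless raw pixel data."""
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return buf.getvalue()

# Quick check with a dummy image standing in for pipe(prompt).images[0]:
demo = Image.new("RGB", (8, 8), color=(255, 0, 0))
png = image_to_png_bytes(demo)
print(png[:8])  # PNG files begin with the signature b'\x89PNG\r\n\x1a\n'
```

Returning PNG-encoded bytes also means the caller can write them straight to an .png file that any image viewer can open.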
Key Takeaways
- Use @app.function with gpu="A10G" to run Stable Diffusion on a GPU in Modal.
- Install all required dependencies inside the Modal image with pip_install.
- Run your app with modal run (or deploy it with modal deploy) and call the function to generate images serverlessly.
- Async functions and different GPU types are supported by adjusting the decorator and the function definition.
- Troubleshoot CUDA memory and dependency issues by adjusting GPU resources and the image's installed packages.