How to use Modal for training
Quick answer
Use the modal Python package to define GPU-enabled functions for training AI models in a serverless environment. Decorate your training function with @app.function(gpu="A10G") and run the app (with modal run, or programmatically with app.run()) to execute training remotely.

Prerequisites
- Python 3.8+
- Modal account and CLI installed (pip install modal)
- GPU quota on Modal
- Basic Python knowledge
Setup
Install the modal package, then set up your Modal account and authenticate the CLI (for example, with modal setup) to enable deployment. Ensure you have GPU quota on Modal for training.

```shell
pip install modal
```

Output:

```
Collecting modal
  Downloading modal-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: modal
Successfully installed modal-1.x.x
```
Step by step
Define a Modal app and a GPU-enabled function to run your training code. Use @app.function(gpu="A10G") to request a GPU instance, then run the app and invoke the function remotely.

```python
import modal

app = modal.App("training-app")

@app.function(gpu="A10G", image=modal.Image.debian_slim().pip_install("torch"))
def train_model():
    import torch

    # Example: simple tensor operation simulating a training step
    x = torch.randn(3, 3)
    y = torch.randn(3, 3)
    z = x @ y
    print("Training result:\n", z.tolist())
    return z.tolist()

if __name__ == "__main__":
    with app.run():
        result = train_model.remote()
        print("Training output:", result)
```

Output (values will vary, since the tensors are random):

```
Training result:
 [[-0.123, 0.456, 0.789], [0.234, -0.567, 0.890], [0.345, 0.678, -0.901]]
Training output: [[-0.123, 0.456, 0.789], [0.234, -0.567, 0.890], [0.345, 0.678, -0.901]]
```
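Note that train_model.remote() returns plain nested Python lists (from z.tolist()), not a tensor. If you want to check locally what that 3×3 product looks like, here is a pure-Python sketch that needs no GPU, torch, or Modal account; matmul is an illustrative helper, not part of Modal:

```python
# Local sketch of the same 3x3 matrix product train_model() computes.
# matmul is a hypothetical helper for illustration only.
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

x = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # identity matrix
y = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
z = matmul(x, y)  # nested lists, same shape as train_model's return value
```

Multiplying by the identity leaves y unchanged, which makes the sketch easy to sanity-check by eye.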
Common variations
- Use different GPU types by changing gpu="A10G" to another supported GPU (for example, "A100").
- Install additional Python packages by chaining pip_install calls in the image definition.
- Run asynchronous training functions by defining them with async def and using await when invoking.
- Expose web endpoints (for example, by stacking @modal.web_endpoint(method="POST") under @app.function()) for interactive training triggers.
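Modal's async invocation (awaiting remote calls, e.g. via fn.remote.aio()) follows the standard asyncio pattern. Here is a local, Modal-free sketch of that pattern; fake_remote_train is a hypothetical stand-in for a remote call, not a Modal API:

```python
import asyncio

# Hypothetical stand-in for an awaitable remote call such as
# fn.remote.aio(); NOT part of Modal, just a local sketch.
async def fake_remote_train(step: int) -> float:
    await asyncio.sleep(0)     # stands in for the network round-trip
    return 0.5 / (step + 1)    # pretend per-step loss

async def main() -> list:
    # Fan out several "remote" steps concurrently, as you would with
    # asyncio.gather over multiple remote calls.
    return await asyncio.gather(*(fake_remote_train(s) for s in range(3)))

losses = asyncio.run(main())
print(losses)
```

The same gather-based fan-out works against real Modal functions, which is the main reason to define training entry points as async when you need concurrent runs.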
```python
import modal

app = modal.App("training-app")

@app.function(
    gpu="A100",
    image=modal.Image.debian_slim().pip_install("torch").pip_install("transformers"),
)
def train_advanced_model():
    import torch
    from transformers import GPT2Model

    model = GPT2Model.from_pretrained("gpt2")
    # GPT2Model expects integer token ids, not raw embeddings
    input_ids = torch.randint(0, model.config.vocab_size, (1, 8))
    outputs = model(input_ids)
    return outputs.last_hidden_state.tolist()

if __name__ == "__main__":
    with app.run():
        result = train_advanced_model.remote()
        print("Advanced training output received.")
```

Output:

```
Advanced training output received.
```
Troubleshooting
- If you see Quota exceeded, request more GPU quota in your Modal dashboard.
- For ImportError, ensure all dependencies are installed in the image via pip_install.
- If deployment hangs, verify your Modal CLI is logged in and your network allows outbound connections.
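One way to make the ImportError case easier to diagnose is to guard imports inside the remote function. Below, require is a hypothetical helper (not a Modal API) that turns a missing package into an actionable message pointing at the image definition:

```python
import importlib

# Hypothetical helper (not part of Modal): import a module and, if it
# is missing, point at the pip_install fix in the image definition.
def require(module_name: str):
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise RuntimeError(
            f"{module_name} is not installed in the image; add "
            f'.pip_install("{module_name}") to the modal.Image definition'
        ) from e

# Inside train_model() you would then write, for example:
# torch = require("torch")
```

Since remote tracebacks can be noisy, surfacing the exact pip_install fix in the error message saves a round trip to the logs.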
Key Takeaways
- Use @app.function(gpu="A10G") to run GPU training on Modal.
- Define dependencies in the image with pip_install for reproducible environments.
- Run training functions remotely with app.run() and .remote(), or deploy the app with modal deploy.
- Modal supports async functions and web endpoints for flexible training workflows.