
Modal pricing

Quick answer

Modal uses a usage-based pricing model: you pay for the compute time and resources your serverless functions consume, with GPU time typically the largest component. There is no fixed monthly fee; costs depend on the resources (CPU, memory, GPU type) you request for the functions in your modal.App and on how long those functions run.

PREREQUISITES

Python 3.8+
Modal account with an API token (token ID and token secret)
pip install modal

Setup

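Install the modal Python package, then authenticate. Running modal setup starts a browser-based login and stores a token in your local Modal configuration. For CI or other non-interactive environments, set the MODAL_TOKEN_ID and MODAL_TOKEN_SECRET environment variables instead.

bash

pip install modal

# Interactive: creates a token and saves it to your local Modal config
modal setup

# Non-interactive alternative: export the token created in your Modal account settings
export MODAL_TOKEN_ID="your-token-id"
export MODAL_TOKEN_SECRET="your-token-secret"

output

Collecting modal
  Downloading modal-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: modal
Successfully installed modal-1.x.x

# modal setup prints login instructions; the export commands print nothing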

Step by step

Here is a simple example of running a GPU-enabled function with Modal. The cost of each call depends on the GPU type you request and on how long the function runs.

python

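import modal

app = modal.App("pricing-example")

# Container image with PyTorch installed
image = modal.Image.debian_slim().pip_install("torch")

# Functions are registered on the app object; gpu selects the GPU type
@app.function(gpu="A10G", image=image)
def run_inference(prompt: str) -> str:
    import torch  # imported here because torch exists only in the remote image
    # Simulate an inference workload
    return f"Processed prompt: {prompt}"

if __name__ == "__main__":
    # app.run() starts an ephemeral app for the duration of the with block
    with app.run():
        result = run_inference.remote("Hello from Modal")
        print(result)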

output

Processed prompt: Hello from Modal

Common variations

You can run Modal functions without GPUs to reduce costs or use different GPU types. Pricing varies by instance type and region. Modal charges based on actual runtime seconds and resources used.

python

import modal

app = modal.App("pricing-example-cpu")

# No gpu argument: the function runs on CPU-only compute, which is cheaper
@app.function()
def run_cpu_task(prompt: str) -> str:
    return f"CPU processed: {prompt}"

if __name__ == "__main__":
    # app.run() starts an ephemeral app for the duration of the with block
    with app.run():
        result = run_cpu_task.remote("Run on CPU")
        print(result)

output

CPU processed: Run on CPU
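
Because billing is driven by runtime and resource type, a rough cost estimate is a per-second rate multiplied by runtime and call count. The sketch below illustrates that arithmetic only. ResourceRate, estimate_cost, and both rate values are hypothetical names and placeholder numbers introduced for this example, not Modal's published prices; substitute the current rates from Modal's pricing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRate:
    """Hypothetical per-second price for one resource configuration."""
    name: str
    usd_per_second: float


def estimate_cost(rate: ResourceRate, runtime_seconds: float, calls: int = 1) -> float:
    """Estimate cost as per-second rate x runtime per call x number of calls."""
    if runtime_seconds < 0 or calls < 0:
        raise ValueError("runtime_seconds and calls must be non-negative")
    return rate.usd_per_second * runtime_seconds * calls


# Placeholder rates for illustration only; these are NOT Modal's actual prices.
GPU_RATE = ResourceRate("gpu-example", usd_per_second=0.0003)
CPU_RATE = ResourceRate("cpu-example", usd_per_second=0.00004)

if __name__ == "__main__":
    # Compare the same workload (1000 calls of 12.5 s each) on both configurations.
    for rate in (GPU_RATE, CPU_RATE):
        cost = estimate_cost(rate, runtime_seconds=12.5, calls=1000)
        print(f"{rate.name}: ${cost:.2f} for 1000 calls of 12.5 s")
```

This estimate ignores anything billed outside the function body, such as container startup or idle time, so treat it as a lower bound and confirm actual usage in the Modal dashboard.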

Troubleshooting

If you see unexpectedly high costs, check your function runtime and GPU usage duration in the Modal dashboard.
Use smaller instance types or CPU-only functions to reduce expenses.
Ensure your credentials are configured (run modal setup, or set the MODAL_TOKEN_ID and MODAL_TOKEN_SECRET environment variables) to avoid authentication errors.
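
To see which functions run longest before checking the dashboard, you can time the function bodies locally. The following is a minimal sketch using only the Python standard library. The names timed, durations, and simulate_workload are hypothetical helpers introduced here, and local wall-clock time is only a proxy: it is an assumption that it tracks billed time closely, and the dashboard remains the authoritative source.

```python
import functools
import time
from typing import Any, Callable, List, Tuple

# Collected (function name, elapsed seconds) pairs for later inspection.
durations: List[Tuple[str, float]] = []


def timed(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Record the wall-clock duration of every call to fn."""

    @functools.wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            # Runs even if fn raises, so failed calls are recorded too.
            durations.append((fn.__name__, time.perf_counter() - start))

    return wrapper


@timed
def simulate_workload(n: int) -> int:
    """Stand-in for a real function body; sums the integers below n."""
    return sum(range(n))


if __name__ == "__main__":
    simulate_workload(1_000_000)
    for name, seconds in durations:
        print(f"{name}: {seconds:.4f} s")
```

Applying a decorator like this to the plain Python body of a function gives a quick view of where the seconds go, which helps decide what to optimize or move to CPU-only compute.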


Key Takeaways

Modal pricing is usage-based, charging for compute time and resources consumed.
GPU instances cost more; choose instance types based on workload and budget.
No fixed monthly fees; control costs by managing runtime duration and resource selection.

Verified 2026-04
