Modal spot instances explained
Quick answer
Modal spot instances are GPU compute resources that run on spare capacity at a lower price and can be preempted at any time. Request one through the gpu parameter of @app.function, and reserve spot capacity for batch jobs or other fault-tolerant AI workloads that can survive an interruption.
Prerequisites
- Python 3.8+
- Modal account and CLI installed
- Modal token configured (modal login)
- Basic Python knowledge
Setup
Install the modal Python package and log in to your Modal account to access spot instances.
pip install modal
modal login
Output:
Successfully logged in as user@example.com
You can now deploy and run Modal functions.
Step by step
Define a Modal app and use the gpu="A10G-spot" parameter in the @app.function decorator to request a spot instance GPU. Spot instances are cheaper but can be interrupted, so use them for fault-tolerant workloads.
import modal

app = modal.App("spot-instance-demo")

@app.function(gpu="A10G-spot")
def run_inference(prompt: str) -> str:
    # Simulate an AI workload; a real function would load and run a model here
    return f"Processed prompt on spot instance: {prompt}"

if __name__ == "__main__":
    # app.run() starts an ephemeral app; .remote() executes the function in Modal's cloud
    with app.run():
        result = run_inference.remote("Hello from Modal spot instance")
        print(result)
Output:
Processed prompt on spot instance: Hello from Modal spot instance
Common variations
Use different GPU types like gpu="A100-spot" depending on availability. Combine spot instances with local caching or checkpointing to handle preemptions gracefully. Use @app.function(gpu="A10G") for on-demand instances if uninterrupted runtime is required.
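The checkpointing advice above can be sketched in plain Python. This is a minimal illustration, not Modal's own mechanism: the helper names (load_checkpoint, save_checkpoint, process_batch) and the JSON file layout are hypothetical, and the work argument stands in for whatever the GPU function actually computes. The point is that progress is saved after each item, so a restarted run skips work already done.

```python
import json
import os
import tempfile
from pathlib import Path


def load_checkpoint(path: Path) -> dict:
    """Return saved progress, or a fresh state if no checkpoint exists yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {"next_index": 0, "results": []}


def save_checkpoint(path: Path, state: dict) -> None:
    """Write to a temp file, then rename, so an interruption cannot leave a half-written file."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump(state, tmp)
    os.replace(tmp_name, path)  # atomic when both paths are on the same filesystem


def process_batch(items: list, checkpoint_path: Path, work=str.upper) -> list:
    """Process items in order, resuming after the last checkpointed item."""
    state = load_checkpoint(checkpoint_path)
    for index in range(state["next_index"], len(items)):
        state["results"].append(work(items[index]))
        state["next_index"] = index + 1
        save_checkpoint(checkpoint_path, state)
    return state["results"]
```

For this to help after a preemption, the checkpoint path has to live on storage that outlasts the container, for example a mounted persistent volume; a file on the container's local disk would be lost along with the instance.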
Troubleshooting
If your spot instance is preempted, catch exceptions and retry the task. Ensure your Modal CLI is up to date to access the latest spot instance features. Check your Modal dashboard for spot instance availability and quotas.
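The catch-and-retry advice can be sketched as a small wrapper around the call site. This is an illustrative sketch under assumptions: PreemptedError is a hypothetical placeholder for whatever exception a preempted call actually raises in your setup, and run_with_retries is not part of the Modal API.

```python
import time


class PreemptedError(Exception):
    """Hypothetical stand-in for the error surfaced when a call is preempted."""


def run_with_retries(task, *args, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call task(*args), retrying with exponential backoff on PreemptedError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task(*args)
        except PreemptedError:
            if attempt == max_attempts:
                raise  # out of attempts; let the caller see the failure
            # Wait base_delay, then 2x, 4x, ... before the next attempt
            sleep(base_delay * 2 ** (attempt - 1))
```

A call such as run_with_retries(run_inference.remote, "Hello") would retry the remote call up to three times. Modal's function decorator also accepts a retries argument, which may be a simpler option where it applies; check the current Modal documentation for how it interacts with preemption.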
Key Takeaways
- Modal spot instances provide cost-effective GPU compute by using interruptible spare capacity.
- Use the gpu parameter with a spot instance type like "A10G-spot" in @app.function to request spot GPUs.
- Spot instances may be preempted, so design your workloads to be fault-tolerant or checkpoint progress.
- On-demand GPU instances are available by omitting "-spot" but cost more.
- Keep your Modal CLI updated and monitor your usage via the Modal dashboard.