Severity: High · Level: Intermediate · Fix time: 5-10 min

ModalGPUNotAvailableError

modal.exception.ModalGPUNotAvailableError

What this error means
Modal raises ModalGPUNotAvailableError when no GPU resources are currently available to allocate for your function or container.

Stack trace

traceback
Traceback (most recent call last):
  File "/app/main.py", line 42, in run_gpu_task
    stub.function.call()
  File "/usr/local/lib/python3.9/site-packages/modal/function.py", line 123, in call
    raise ModalGPUNotAvailableError("GPU resource not available, retrying...")
modal.exception.ModalGPUNotAvailableError: GPU resource not available, retrying...

QUICK FIX
Add retry logic with exponential backoff around your Modal GPU function calls to handle temporary GPU unavailability gracefully.

Why it happens

Modal manages GPU resources in a shared environment. When all GPUs are currently allocated to other users or tasks, Modal cannot assign a GPU to your function, triggering this error. It often occurs during peak usage or if your GPU request exceeds available capacity.

Detection

Monitor your Modal function logs for ModalGPUNotAvailableError exceptions and track GPU resource usage metrics in the Modal dashboard to anticipate shortages before failures.
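
As a rough illustration of the log-monitoring half of this advice, the sketch below wraps a call so that each occurrence of a chosen exception is logged and counted before being re-raised. The names `track_failures`, `failure_counts`, and the `gpu_monitor` logger are hypothetical, and a stand-in exception class is used so the snippet runs without Modal installed; in real code you would pass `modal.exception.ModalGPUNotAvailableError` as `exc_type`.

```python
import logging
from collections import Counter

logger = logging.getLogger("gpu_monitor")  # hypothetical logger name
failure_counts = Counter()  # hypothetical in-process counter


class GPUUnavailable(Exception):
    """Stand-in for modal.exception.ModalGPUNotAvailableError."""


def track_failures(call, exc_type=GPUUnavailable):
    """Run call(); log and count exc_type before re-raising it."""
    try:
        return call()
    except exc_type as exc:
        failure_counts[exc_type.__name__] += 1
        logger.warning(
            "GPU allocation failed (%d so far): %s",
            failure_counts[exc_type.__name__],
            exc,
        )
        raise
```

The counter only lives as long as the process; exporting it to whatever metrics system you already use is left out of this sketch.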

Causes & fixes

1

All GPUs in the Modal pool are currently allocated to other users or tasks.

✓ Fix

Reduce GPU concurrency or schedule your GPU tasks during off-peak hours to increase chances of GPU availability.

2

Your Modal function requests more GPUs than are available in the configured pool or quota.

✓ Fix

Adjust your function's GPU resource request to a lower number within your quota or request an increased GPU quota from Modal support.

3

Transient infrastructure issues in Modal's backend make GPUs temporarily unavailable.

✓ Fix

Implement retry logic with exponential backoff in your code to automatically retry GPU allocation after short delays.
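
The first fix above, reducing GPU concurrency, can be approximated on the client side by capping how many GPU calls are in flight at once. This is a minimal sketch using only the Python standard library, not a Modal-side concurrency setting; `run_with_cap`, `MAX_IN_FLIGHT`, and `fake_gpu_call` are hypothetical names, and the stand-in function would be replaced by your actual Modal call.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 2  # hypothetical cap; keep it at or below your GPU quota


def fake_gpu_call(item):
    """Stand-in for a remote GPU call; replace with your Modal function."""
    return item * 2


def run_with_cap(func, items, max_in_flight=MAX_IN_FLIGHT):
    """Run func over items with at most max_in_flight calls active at once."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # map() returns results in the same order as the inputs
        return list(pool.map(func, items))


if __name__ == "__main__":
    print(run_with_cap(fake_gpu_call, [1, 2, 3, 4]))
```

A thread pool fits here because each worker spends its time waiting on a remote call, so the pool size acts directly as the limit on simultaneous GPU requests.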

Code: broken vs fixed

Broken - triggers the error
python
import modal

stub = modal.Stub()

@stub.function(gpu=1)
def gpu_task():
    # Runs on a GPU worker once Modal has allocated one
    return "GPU task done"

if __name__ == "__main__":
    # No error handling: ModalGPUNotAvailableError propagates if no GPU is free
    print(gpu_task())

Fixed - works correctly
python
import time
import modal

stub = modal.Stub()

@stub.function(gpu=1)
def gpu_task():
    return "GPU task done"

if __name__ == "__main__":
    max_retries = 5
    delay = 2  # seconds to wait before the first retry
    for attempt in range(1, max_retries + 1):
        try:
            print(gpu_task())
            break  # success: stop retrying
        except modal.exception.ModalGPUNotAvailableError:
            # Only wait if another attempt is still to come
            if attempt < max_retries:
                print(f"GPU not available, retrying in {delay} seconds...")
                time.sleep(delay)
                delay *= 2  # exponential backoff
    else:
        # Reached only when no attempt succeeded (the loop never hit break)
        print("Failed to acquire GPU after retries.")

The fixed version catches ModalGPUNotAvailableError and retries with exponential backoff, so a temporary GPU shortage no longer fails the run on the first attempt.

Workaround

Wrap your Modal GPU function calls in try/except blocks catching ModalGPUNotAvailableError and implement manual retries with delays to handle temporary GPU shortages.
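
Because a retry loop is easy to get subtly wrong when it is repeated at every call site, one option is to factor it into a helper. The sketch below is an illustration rather than Modal's own retry mechanism: `retry_with_backoff` and its parameters are hypothetical, the delay cap and random jitter are additions not described above, and the exception type is passed in so the snippet runs without Modal installed.

```python
import random
import time


def retry_with_backoff(call, retry_on, max_attempts=5, base_delay=2.0,
                       max_delay=60.0, sleep=time.sleep):
    """Call call() until it succeeds or max_attempts is reached.

    retry_on is the exception type (or tuple of types) to retry on,
    e.g. modal.exception.ModalGPUNotAvailableError.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            # Double the delay after each failure, capped at max_delay
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out clients that failed at the same moment
            sleep(delay + random.uniform(0, delay / 2))
```

With Modal installed, a call might look like `retry_with_backoff(gpu_task, modal.exception.ModalGPUNotAvailableError)`. The `sleep` parameter exists so tests can substitute a no-op instead of actually waiting.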

Prevention

Design your Modal workloads to request GPUs conservatively, schedule GPU-heavy tasks during low-demand periods, and use Modal's quota management to avoid over-requesting GPUs.

Python 3.9+ · modal >=0.1.0 · tested on 0.9.x
Verified 2026-04