High severity · Intermediate · Fix: 5-10 min

RunPodEndpointNotReadyError

runpod.client.errors.RunPodEndpointNotReadyError

What this error means

RunPod returns an endpoint not ready error during cold start when the deployed model container is still initializing and cannot serve requests yet.

Stack trace

traceback

runpod.client.errors.RunPodEndpointNotReadyError: Endpoint is not ready yet. Cold start in progress. Please retry after some time.

QUICK FIX

Catch RunPodEndpointNotReadyError and retry the inference call after a delay, so the endpoint has time to finish its cold start.

Why it happens

RunPod deploys models in containers that may take several seconds to minutes to initialize on first request (cold start). During this time, the endpoint is not ready to accept inference calls, causing this error.

Detection

Monitor API responses for RunPodEndpointNotReadyError exceptions and log their timestamps, so that cold start delays are detected before they impact users.
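As a rough illustration of the logging described above, the sketch below wraps a call and records a timestamped warning whenever a not-ready error occurs. The names `call_and_log`, `EndpointNotReadyError`, and the `coldstart` logger are hypothetical; the exception class is a local stand-in so the example runs without the SDK, and in real code you would pass `RunPodEndpointNotReadyError` as `not_ready_exc` instead.

```python
import logging
import time

logger = logging.getLogger("coldstart")  # hypothetical logger name


class EndpointNotReadyError(Exception):
    """Stand-in for RunPodEndpointNotReadyError so the sketch runs without the SDK."""


def call_and_log(fn, not_ready_exc=EndpointNotReadyError):
    """Run fn(); log a timestamped warning if the endpoint is not ready, then re-raise."""
    try:
        return fn()
    except not_ready_exc as exc:
        # Record wall-clock time so cold starts can be correlated with deploys or scale-ups.
        logger.warning("endpoint not ready at %.3f: %s", time.time(), exc)
        raise
```

Re-raising keeps the caller's retry logic in charge. The wrapper only adds visibility, so the warnings can feed whatever alerting you already run on your logs.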

Causes & fixes

Model container is still starting up after deployment or scale-up

✓ Fix

Implement retry logic with exponential backoff to wait for the endpoint to become ready before sending inference requests.

Sending inference requests immediately after deployment without readiness checks

✓ Fix

Add a health check or readiness probe to confirm the endpoint is ready before routing traffic.

Insufficient resources causing slow container startup

✓ Fix

Increase allocated CPU/memory resources for the RunPod deployment to reduce cold start time.
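Of the fixes above, retry with exponential backoff can be sketched as follows. This is a minimal, SDK-independent illustration. `call_with_backoff` and its parameters are hypothetical names, and `EndpointNotReadyError` is a local stand-in for `RunPodEndpointNotReadyError`. The delay doubles after each failed attempt, is capped at `max_delay`, and gets a small random jitter.

```python
import random
import time


class EndpointNotReadyError(Exception):
    """Stand-in for RunPodEndpointNotReadyError so the sketch runs without the SDK."""


def call_with_backoff(fn, retry_on=EndpointNotReadyError, max_attempts=5,
                      base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn(), retrying on `retry_on` with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

With the client shown elsewhere on this page, usage would presumably look like `call_with_backoff(lambda: client.infer(model_id='my-model', input_data={'text': 'Hello'}), retry_on=RunPodEndpointNotReadyError)`. Tune `base_delay` and `max_attempts` to the cold start times you actually observe.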

Code: broken vs fixed

Broken - triggers the error

python

from runpod import RunPodClient
client = RunPodClient()
response = client.infer(model_id='my-model', input_data={'text': 'Hello'})  # This line raises RunPodEndpointNotReadyError

Fixed - works correctly

python

import time

from runpod import RunPodClient
from runpod.client.errors import RunPodEndpointNotReadyError

client = RunPodClient()

# Retry up to 5 times, pausing between attempts while the container cold-starts.
for attempt in range(5):
    try:
        response = client.infer(model_id='my-model', input_data={'text': 'Hello'})
        print('Inference response:', response)
        break
    except RunPodEndpointNotReadyError:
        print('Endpoint not ready, retrying...')
        time.sleep(10)  # Wait 10 seconds before retrying
else:
    # The loop finished without a break, so every attempt failed.
    print('Failed to get response after retries')

# Fixed by adding retry with delay on endpoint not ready error

Added retry loop with delay to handle cold start by waiting for the RunPod endpoint to become ready before sending inference requests.

Workaround

Catch RunPodEndpointNotReadyError (or match the "Endpoint is not ready" text in the error message) and trigger a wait-and-retry mechanism instead of failing immediately.

Prevention

Use RunPod health checks or readiness probes to confirm endpoint availability before sending traffic, and allocate sufficient resources to minimize cold start delays.
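One way to gate traffic on readiness is a small polling helper, sketched here. It assumes you supply an `is_ready` callable that returns True once the endpoint can serve requests. How that callable queries RunPod (a health request, a status field) is left open, because this page does not specify the health check API. `wait_until_ready` and its parameters are hypothetical names.

```python
import time


def wait_until_ready(is_ready, timeout=300.0, interval=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True or `timeout` seconds elapse.

    Returns True if the endpoint became ready in time, otherwise False.
    """
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True
        if clock() + interval > deadline:
            return False  # the next poll would land past the deadline
        sleep(interval)
```

Call it once after a deployment or scale-up and route traffic only when it returns True. The injectable `clock` and `sleep` arguments exist so the helper can be tested without real waiting.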

Python 3.9+ · runpod-client >=0.1.0 · tested on 0.2.x

Verified 2026-04
