TimeoutError
asyncio.exceptions.TimeoutError
Stack trace
asyncio.exceptions.TimeoutError: The request to start the Replicate model timed out after waiting too long for the model to become ready.
Why it happens
Replicate hosts models in serverless environments that may require a cold start when invoked after inactivity. This cold start can take several seconds or more, causing the client request to exceed the default timeout and raise a TimeoutError.
Detection
Record the latency of each model invocation and catch asyncio.TimeoutError (an alias of the built-in TimeoutError since Python 3.11) around model calls, so that cold-start delays show up in monitoring before they cause user-facing failures.
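A minimal sketch of this kind of detection, assuming the model call is available as an awaitable. The names timed_call and call_model and the timeout values are hypothetical and chosen for illustration; asyncio.wait_for is the standard-library way to put a deadline on an awaitable.

```python
import asyncio
import time


async def timed_call(make_call, timeout_s):
    """Await make_call() with a deadline; return (result, elapsed_s, timed_out)."""
    start = time.perf_counter()
    try:
        result = await asyncio.wait_for(make_call(), timeout=timeout_s)
        return result, time.perf_counter() - start, False
    except asyncio.TimeoutError:
        # On Python 3.11+ this is the same class as the built-in TimeoutError.
        return None, time.perf_counter() - start, True


async def call_model():
    """Hypothetical stand-in for a real model call; sleeps to mimic a cold start."""
    await asyncio.sleep(0.2)
    return "output"


if __name__ == "__main__":
    result, elapsed, timed_out = asyncio.run(timed_call(call_model, timeout_s=0.05))
    print(f"timed_out={timed_out} elapsed={elapsed:.2f}s")
```

Logging elapsed for successful calls as well as timed-out ones makes it possible to see latency creeping toward the timeout before requests start failing.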
Causes & fixes
Cause: The model container is cold and takes longer than the default timeout to start.
Fix: Increase the timeout parameter in the Replicate client call to allow more time for the model to start.
Cause: Network latency or a slow internet connection delays responses.
Fix: Ensure stable, fast network connectivity, or retry the request with exponential backoff.
Cause: A large or complex model inherently has longer cold start times.
Fix: Pre-warm the model by sending a dummy request shortly before actual usage, or schedule periodic keep-alive pings.
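The exponential backoff mentioned above can be sketched as a delay schedule. This is an illustrative helper, not part of the Replicate client; the name backoff_delays and its default values are assumptions.

```python
def backoff_delays(base_s=1.0, factor=2.0, max_delay_s=30.0, attempts=5):
    """Return the wait before each retry: base, base*factor, ... capped at max_delay_s."""
    return [min(base_s * factor ** i, max_delay_s) for i in range(attempts)]


print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

The cap keeps later retries from waiting unboundedly long. Adding a small random jitter to each delay is a common refinement when many clients retry at once.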
Code: broken vs fixed
# Broken
import replicate
model = replicate.models.get("owner/model-name")
# This call may time out if the model is cold
output = model.predict(input="some input")  # may raise TimeoutError
print(output)
# Fixed
import os
import replicate
# The client reads the API token from the REPLICATE_API_TOKEN environment variable; fail early if it is unset
if not os.getenv("REPLICATE_API_TOKEN"):
    raise RuntimeError("REPLICATE_API_TOKEN is not set")
model = replicate.models.get("owner/model-name")
# Allow up to 60 seconds so a cold start can finish
# (predict() and its timeout argument depend on the installed client version; newer releases use replicate.run())
output = model.predict(input="some input", timeout=60)
print(output)
Workaround
Wrap the model.predict call in a try/except block that catches TimeoutError, then retry after a short delay to ride out transient cold-start delays. Cap the number of attempts so that a persistent failure still surfaces.
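A minimal sketch of such a retry wrapper. The model call is passed in as a plain callable, so nothing here depends on a particular client API; call_with_retry and flaky_call are hypothetical names, and the retry count and delay are illustrative.

```python
import asyncio
import time


def call_with_retry(call, retries=3, delay_s=2.0, sleep=time.sleep):
    """Run call(); on a timeout, wait delay_s and try again, up to `retries` retries."""
    for attempt in range(retries + 1):
        try:
            return call()
        except (TimeoutError, asyncio.TimeoutError):
            # asyncio.TimeoutError is a separate class before Python 3.11, so catch both.
            if attempt == retries:
                raise  # out of retries: surface the original error
            sleep(delay_s)


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_call():
        """Hypothetical model call that times out twice, then succeeds."""
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise TimeoutError("model still starting")
        return "output"

    print(call_with_retry(flaky_call, delay_s=0.01))
```

In real use, call would be something like lambda: model.predict(input="some input"). Re-raising after the last attempt keeps a genuine outage from being silently swallowed.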
Prevention
Implement periodic keep-alive requests to the Replicate model to prevent cold starts and configure appropriate timeouts based on model size and expected latency.
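One way to sketch a periodic keep-alive is a background asyncio task. The ping coroutine below is a hypothetical stand-in for a small, cheap request to the model, and the interval is illustrative; a suitable real interval depends on how long the platform keeps an idle model warm, which is not specified here.

```python
import asyncio


async def keep_alive(ping, interval_s, stop):
    """Call ping() every interval_s seconds until the stop event is set."""
    while not stop.is_set():
        try:
            await ping()
        except Exception as exc:  # a failed ping should not end the loop
            print(f"keep-alive ping failed: {exc!r}")
        try:
            # Sleep for the interval, but wake immediately if stop is set.
            await asyncio.wait_for(stop.wait(), timeout=interval_s)
        except asyncio.TimeoutError:
            pass  # interval elapsed; send the next ping


async def main():
    pings = []

    async def ping():
        """Hypothetical stand-in for a small, cheap model request."""
        pings.append(1)

    stop = asyncio.Event()
    task = asyncio.create_task(keep_alive(ping, interval_s=0.05, stop=stop))
    await asyncio.sleep(0.12)  # let a few pings go out
    stop.set()
    await task
    return len(pings)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Keep-alive requests may themselves incur usage charges, so the interval is a trade-off between cold-start risk and cost.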