TimeoutError
asyncio.exceptions.TimeoutError
Stack trace
Traceback (most recent call last):
  File "app.py", line 42, in <module>
    response = client.inference.create(model="together/gpt-neox-20b", prompt="Hello")
  File "/usr/local/lib/python3.9/site-packages/togetherai/client.py", line 88, in create
    return asyncio.run(self._send_request(payload))
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.9/site-packages/togetherai/client.py", line 75, in _send_request
    await asyncio.wait_for(self._session.post(self._url, json=payload), timeout=10)
asyncio.exceptions.TimeoutError
Why it happens
A Together AI inference timeout error occurs when a request to the model server takes longer than the configured timeout, usually because of network latency, server overload, or a large prompt that takes a long time to process. In the stack trace above, the client SDK wraps the HTTP POST in asyncio.wait_for with timeout=10, so asyncio.exceptions.TimeoutError is raised if the server has not responded within 10 seconds.
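The mechanism can be reproduced without any network access. The sketch below uses only the standard library; slow_server and call_with_timeout are hypothetical stand-ins, not part of any Together AI SDK.

```python
import asyncio


async def slow_server(delay: float) -> str:
    # Stand-in for a slow HTTP response; not a real Together AI call.
    await asyncio.sleep(delay)
    return "ok"


async def call_with_timeout(delay: float, timeout: float) -> str:
    try:
        # wait_for cancels the inner coroutine and raises TimeoutError
        # if it has not finished within `timeout` seconds.
        return await asyncio.wait_for(slow_server(delay), timeout=timeout)
    except asyncio.TimeoutError:
        return "timed out"


fast = asyncio.run(call_with_timeout(delay=0.01, timeout=1.0))
slow = asyncio.run(call_with_timeout(delay=1.0, timeout=0.05))
print(fast, slow)  # prints "ok timed out"
```

The second call fails only because the simulated response time exceeds the timeout, which is the same condition that produces the traceback above.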
Detection
Monitor your inference calls for asyncio.TimeoutError exceptions and log the duration of each request. Set alerts on repeated timeouts to catch network or server issues early.
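One way to do this, as a minimal sketch: wrap each call in a helper that logs its duration and counts timeouts in a sliding window. TimeoutMonitor, timed_call, and the threshold and window values are illustrative assumptions, and the logger.error line stands in for whatever alerting system you use.

```python
import asyncio
import logging
import time
from collections import deque

logger = logging.getLogger("inference")


class TimeoutMonitor:
    """Tracks recent timeouts and flags when too many occur within a window."""

    def __init__(self, threshold: int = 3, window_seconds: float = 60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timeouts = deque()

    def record_timeout(self, now=None) -> bool:
        """Record one timeout; return True if the alert threshold is reached."""
        now = time.monotonic() if now is None else now
        self._timeouts.append(now)
        # Drop entries that have fallen out of the window.
        while self._timeouts and now - self._timeouts[0] > self.window_seconds:
            self._timeouts.popleft()
        return len(self._timeouts) >= self.threshold


def timed_call(fn, monitor, *args, **kwargs):
    """Call fn, log how long it took, and record any asyncio.TimeoutError."""
    start = time.monotonic()
    try:
        return fn(*args, **kwargs)
    except asyncio.TimeoutError:
        if monitor.record_timeout():
            logger.error("Repeated inference timeouts: check network and server status")
        raise
    finally:
        logger.info("inference call took %.2fs", time.monotonic() - start)
```

With the client from this page, a call would look like timed_call(client.inference.create, monitor, model="together/gpt-neox-20b", prompt="Hello").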
Causes & fixes
Cause: Network latency or connectivity issues delay the server response.
Fix: Check your network connection and retry the request with exponential backoff to handle transient network delays.
Cause: The Together AI server is overloaded or slow because of high traffic or a large prompt.
Fix: Reduce prompt size or complexity, or implement retry logic with an increased timeout to accommodate longer processing times.
Cause: The client-side timeout is set too low for the inference request.
Fix: Increase the timeout value in the client SDK call to allow more time for the server to respond.
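Retrying with exponential backoff can be sketched as follows. retry_with_backoff and its parameters are hypothetical names chosen for illustration; the delays (1 s, 2 s, 4 s, capped at 30 s, with up to 10% jitter) are assumptions to tune for your workload.

```python
import asyncio
import random
import time


def retry_with_backoff(fn, max_attempts=4, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn(); on asyncio.TimeoutError, wait and retry with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except asyncio.TimeoutError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: let the caller see the timeout.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Small random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

The inference call from this page could then be passed in as retry_with_backoff(lambda: client.inference.create(model="together/gpt-neox-20b", prompt="Hello")). Only timeouts are retried; other exceptions propagate immediately.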
Code: broken vs fixed
# Broken: no error handling, so a slow response surfaces as an unhandled TimeoutError
from togetherai import TogetherAI

client = TogetherAI(api_key="my_api_key")
response = client.inference.create(model="together/gpt-neox-20b", prompt="Hello")  # This line raises TimeoutError

# Fixed: overall deadline plus error handling
import asyncio
import os

from togetherai import TogetherAI

# Read the key from the TOGETHERAI_API_KEY environment variable instead of hard-coding it
client = TogetherAI(api_key=os.environ["TOGETHERAI_API_KEY"])

async def run_inference():
    try:
        # create() is synchronous (per the stack trace it calls asyncio.run internally),
        # so run it in a worker thread instead of awaiting it directly.
        response = await asyncio.wait_for(
            asyncio.to_thread(
                client.inference.create, model="together/gpt-neox-20b", prompt="Hello"
            ),
            timeout=30,  # Overall deadline; the SDK's own 10 s timeout may still fire first
        )
        print(response)
    except asyncio.TimeoutError:
        print("Inference request timed out. Consider retrying with backoff.")

asyncio.run(run_inference())
Workaround
Wrap the inference call in a try/except block that catches asyncio.TimeoutError, then retry the request with exponential backoff or fall back to a cached response if one is available.
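The cached-response fallback might look like the sketch below. CachedFallback is a hypothetical helper, the in-memory dict is an assumption (a real deployment may need expiry or a shared cache), and the cache key is whatever identifies a request in your application, for example the model name plus the prompt.

```python
import asyncio


class CachedFallback:
    """Remembers the last good result per key and serves it when a call times out."""

    def __init__(self):
        self._cache = {}

    def call(self, key, fn):
        try:
            result = fn()
        except asyncio.TimeoutError:
            if key in self._cache:
                return self._cache[key]  # Possibly stale, but better than failing.
            raise  # Nothing cached for this key: surface the timeout.
        self._cache[key] = result
        return result
```

A cached answer can be out of date, so this suits requests where an older response is acceptable.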
Prevention
Implement robust retry logic with exponential backoff and increase client-side timeout settings; monitor network health and Together AI server status to avoid hitting timeouts.