Handle streaming timeout errors
Wrap the client.chat.completions.create(stream=True) call so that it catches TimeoutError or network exceptions and reconnects automatically.
Why this happens
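Streaming timeout errors happen when the network connection to the API server is slow, unstable, or interrupted during a streaming response. The client then raises a timeout or similar network exception while waiting for the next chunk of data. Note that the openai Python SDK (v1 and later) reports request timeouts as openai.APITimeoutError, which is not a subclass of the built-in TimeoutError, and a stall in the middle of a stream can surface as an httpx timeout exception instead.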
Typical triggering code looks like this:
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
If the connection stalls or the server is slow to respond, the client may time out and raise an error, causing your app to crash or hang.
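from openai import OpenAI, APITimeoutError
import httpx
import os

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Request timeouts raise APITimeoutError; a stall mid-stream may surface as an
# httpx timeout instead, so both are caught alongside the built-in TimeoutError.
TIMEOUT_ERRORS = (APITimeoutError, httpx.TimeoutException, TimeoutError)

try:
    stream = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello"}],
        stream=True
    )
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
except TIMEOUT_ERRORS as e:
    print(f"Streaming timeout error: {e}")

Output:
Streaming timeout error: The read operation timed out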
The fix
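Wrap the streaming call in a retry loop with exponential backoff so that a timeout triggers a reconnect. This lets your app recover from transient network issues instead of crashing. Each retry sends a new request, so the response starts again from the beginning rather than resuming where the stream stopped.
The example below makes up to 3 attempts, waiting 1 second and then 2 seconds between them, before giving up.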
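from openai import OpenAI, APITimeoutError
import httpx
import os
import time

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Request timeouts raise APITimeoutError; a stall mid-stream may surface as an
# httpx timeout instead, so both are caught alongside the built-in TimeoutError.
TIMEOUT_ERRORS = (APITimeoutError, httpx.TimeoutException, TimeoutError)

max_retries = 3
retry_delay = 1  # seconds

for attempt in range(max_retries):
    try:
        # Each attempt sends a new request, so the response starts over.
        stream = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "Hello"}],
            stream=True
        )
        for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="", flush=True)
        break  # success, exit retry loop
    except TIMEOUT_ERRORS as e:
        # Leading newline: the streamed text above was printed without one.
        print(f"\nTimeoutError on attempt {attempt + 1}: {e}")
        if attempt == max_retries - 1:
            print("Max retries reached, aborting.")
            raise
        time.sleep(retry_delay * 2 ** attempt)  # exponential backoff

Output:
Hello, this is a streamed response...
TimeoutError on attempt 1: The read operation timed out
Hello, continuing after retry...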
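Because a retry restarts the response, text printed before the timeout appears again on the next attempt. One way around this is to buffer the chunks and only hand back the text once a stream completes. The sketch below illustrates that idea with a reusable helper and adds random jitter to the backoff. The names collect_stream_with_retry, open_stream, and make_flaky_stream are hypothetical and not part of any SDK. Here open_stream stands in for a function that calls client.chat.completions.create(stream=True) and yields each chunk's text, and the flaky stream simulates a timeout so the example runs without network access.

```python
import random
import time
from typing import Callable, Iterable, List, Tuple, Type


def collect_stream_with_retry(
    open_stream: Callable[[], Iterable[str]],
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the full streamed text, restarting the stream on timeout."""
    for attempt in range(max_retries):
        pieces: List[str] = []  # partial output from a failed attempt is dropped
        try:
            for piece in open_stream():
                pieces.append(piece)
            return "".join(pieces)
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of attempts, let the caller decide what to do
            # Exponential backoff (1x, 2x, 4x ... the base delay), scaled by a
            # random factor between 0.5 and 1.0 so clients do not retry in lockstep.
            delay = base_delay * 2 ** attempt * random.uniform(0.5, 1.0)
            sleep(delay)
    raise ValueError("max_retries must be at least 1")


def make_flaky_stream(failures: int) -> Callable[[], Iterable[str]]:
    """Hypothetical stand-in for an API stream that times out `failures` times."""
    state = {"calls": 0}

    def open_stream() -> Iterable[str]:
        state["calls"] += 1
        yield "Hello"
        if state["calls"] <= failures:
            raise TimeoutError("The read operation timed out")
        yield ", world"

    return open_stream


# One simulated timeout, then success; sleep is stubbed out to keep the demo fast.
print(collect_stream_with_retry(make_flaky_stream(1), sleep=lambda _: None))
# prints "Hello, world"
```

The trade-off is latency: the caller sees nothing until a full response arrives. If you need to show text as it streams, you would instead have to clear or mark the partial output before retrying. When using this with a real SDK, pass that SDK's timeout exception types as retry_on.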
Preventing it in production
- Implement robust retry logic with exponential backoff for all streaming calls to handle transient network failures gracefully.
- Set reasonable client-side timeouts and monitor network health to detect persistent issues early.
- Use circuit breakers or fallback responses if retries repeatedly fail to maintain user experience.
- Log timeout errors with context to diagnose and improve network reliability.
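As a rough illustration of the circuit breaker, fallback, and logging points above, here is a minimal sketch in plain Python. CircuitBreaker, call_with_fallback, the thresholds, and the request_id field are all hypothetical names and values chosen for this example. A production system would more likely use an established resilience library and catch its SDK's own timeout exception types.

```python
import logging
import time
from typing import Callable, Optional

logger = logging.getLogger("streaming")


class CircuitBreaker:
    """Stop calling a failing dependency until a cool-down has passed."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # cool-down in seconds
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True  # circuit closed, calls go through
        # Circuit open: allow a trial call only once the cool-down has elapsed.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # open (or re-open) the circuit


def call_with_fallback(
    breaker: CircuitBreaker,
    operation: Callable[[], str],
    fallback: str,
    request_id: str,
) -> str:
    """Run `operation`, returning `fallback` on timeout or while the circuit is open."""
    if not breaker.allow():
        return fallback
    try:
        result = operation()
    except TimeoutError as exc:
        breaker.record_failure()
        # Log with enough context to correlate the failure later.
        logger.warning(
            "stream timeout request_id=%s failures=%d error=%s",
            request_id, breaker.failures, exc,
        )
        return fallback
    breaker.record_success()
    return result
```

Here operation would wrap the streaming call (including any retries), and fallback might be a short message asking the user to try again. While the circuit is open, requests skip the API entirely, which avoids piling more load onto a connection that is already failing.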
Key Takeaways
- Wrap streaming calls in retry loops with exponential backoff to handle timeouts.
- Catch TimeoutError explicitly to reconnect streaming safely.
- Monitor and log streaming errors to improve network reliability.
- Use circuit breakers or fallbacks to maintain UX during persistent failures.