Handle streaming timeout errors

Quick answer

Streaming timeout errors occur when the connection to the AI API stalls or drops while a response is being streamed. Wrap your client.chat.completions.create(stream=True) call, including the loop that reads the chunks, in a retry loop with exponential backoff that catches openai.APITimeoutError (and related network exceptions) and re-issues the request.

ERROR TYPE api_error

QUICK FIX

Add retry logic with exponential backoff around your streaming call so that openai.APITimeoutError and related network exceptions are handled automatically.

Why this happens

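Streaming timeout errors happen when the network connection to the AI API server is slow, unstable, or interrupted during a streaming response. The OpenAI Python SDK reports a timed-out request as openai.APITimeoutError (a subclass of openai.APIConnectionError), not as Python's built-in TimeoutError. A timeout that occurs while the client is waiting for the next chunk can also surface as an exception from the underlying HTTP library, such as httpx.ReadTimeout.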

Typical triggering code looks like this:

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)

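If the connection stalls or the server is slow to send the next chunk, the client can time out and raise an exception. Left unhandled, that exception crashes your app; with a very long timeout, the app appears to hang instead.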

python

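from openai import OpenAI, APITimeoutError
import httpx
import os

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

try:
    stream = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello"}],
        stream=True
    )
    for chunk in stream:
        # Some chunks can arrive with an empty choices list, so guard the index.
        if chunk.choices:
            print(chunk.choices[0].delta.content or "", end="", flush=True)
except (APITimeoutError, httpx.TimeoutException) as e:
    # APITimeoutError: raised by the SDK when the request itself times out.
    # httpx.TimeoutException: can surface while reading chunks mid-stream.
    print(f"Streaming timeout error: {e}")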

output

Streaming timeout error: The read operation timed out

The fix

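Wrap the streaming call, including the loop that consumes the chunks, in a retry loop with exponential backoff that catches the SDK's timeout exception (openai.APITimeoutError) and the HTTP library's timeout exception (httpx.TimeoutException). Each retry re-issues the request, so the response starts again from the beginning rather than resuming where it stopped, and any text already printed is repeated. This lets your app recover from transient network issues without crashing.

The example below makes up to 3 attempts, waiting 1 second and then 2 seconds between them, before giving up.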

python

from openai import OpenAI, APITimeoutError
import httpx
import os
import time

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

max_retries = 3  # total number of attempts
retry_delay = 1  # seconds

# Exceptions treated as transient: the SDK's timeout error and httpx timeouts raised mid-stream.
RETRYABLE = (APITimeoutError, httpx.TimeoutException)

for attempt in range(max_retries):
    try:
        stream = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "Hello"}],
            stream=True
        )
        for chunk in stream:
            # Some chunks can arrive with an empty choices list, so guard the index.
            if chunk.choices:
                print(chunk.choices[0].delta.content or "", end="", flush=True)
        break  # success, exit retry loop
    except RETRYABLE as e:
        print(f"\nTimeout on attempt {attempt + 1}: {e}")
        if attempt == max_retries - 1:
            print("Max retries reached, aborting.")
            raise
        time.sleep(retry_delay * 2 ** attempt)  # exponential backoff: 1s, then 2s

output

Hello, this is a streamed response...
Timeout on attempt 1: The read operation timed out
Hello, continuing after retry...
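
The retry loop above can be factored into a reusable helper. The sketch below is one possible shape, not part of the OpenAI SDK: backoff_delays and stream_with_retry are hypothetical names, open_stream stands in for any callable that starts a fresh stream and yields text pieces, and the built-in TimeoutError is used only so the example runs without network access (in real code you would pass the SDK's exception types as retryable). It buffers each attempt so partial text from a failed attempt is discarded, and it adds random jitter to the delays so many clients do not retry in lockstep.

```python
import random
import time
from typing import Callable, Iterable, List, Tuple, Type


def backoff_delays(count: int, base: float = 1.0, cap: float = 30.0) -> List[float]:
    """Upper bound of the wait before each retry: base * 2**n, capped at `cap`."""
    return [min(cap, base * 2 ** n) for n in range(count)]


def stream_with_retry(
    open_stream: Callable[[], Iterable[str]],
    retryable: Tuple[Type[BaseException], ...] = (TimeoutError,),  # stand-in default
    max_attempts: int = 3,
    base: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Consume a text stream fully, restarting it from scratch on a retryable error."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delays = backoff_delays(max_attempts - 1, base)
    for attempt in range(max_attempts):
        parts: List[str] = []  # fresh buffer per attempt, so partial text is dropped
        try:
            for piece in open_stream():
                parts.append(piece)
            return "".join(parts)
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the error
            # Full jitter: wait a random time up to the exponential bound.
            sleep(random.uniform(0, delays[attempt]))
    raise RuntimeError("unreachable")  # the loop always returns or raises
```

With the OpenAI client, open_stream would be a small generator that calls client.chat.completions.create(..., stream=True) and yields each chunk's delta.content. Buffering trades first-token latency for never showing duplicated text; if you need live output, print as you go and accept that a retry repeats the beginning of the response.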

Preventing it in production

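Wrap every streaming call in retry logic with exponential backoff, ideally with random jitter, so transient network failures do not reach users.
Set an explicit client-side timeout instead of relying on the SDK default of 10 minutes, for example OpenAI(timeout=30.0). The SDK's built-in max_retries option (default 2) retries the initial request, but a stream that fails midway still needs your own retry logic.
Use a circuit breaker or a fallback response when retries keep failing, so users get a fast, predictable answer instead of a long hang.
Log each timeout with context (model, attempt number, elapsed time, and request ID if available) so persistent network problems can be diagnosed.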
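
As a rough illustration of the circuit breaker and fallback idea in the list above, here is a minimal sketch. CircuitBreaker and call_with_fallback are hypothetical names, the threshold and cooldown values are arbitrary, and the built-in TimeoutError again stands in for whatever exception types your client raises. The breaker opens after a run of consecutive failures, short-circuits calls to the fallback while open, and allows a trial call once the cooldown has passed.

```python
import time
from typing import Callable, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; permits a trial call after `cooldown` seconds."""

    def __init__(self, threshold: int = 3, cooldown: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock  # injectable so the breaker can be tested without waiting
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True  # closed: calls pass through
        # Open: only allow a trial call once the cooldown has elapsed.
        return self.clock() - self.opened_at >= self.cooldown

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()  # (re)open and restart the cooldown


def call_with_fallback(
    breaker: CircuitBreaker,
    request: Callable[[], T],
    fallback: T,
    retryable: Tuple[Type[BaseException], ...] = (TimeoutError,),  # stand-in default
) -> T:
    """Run `request` unless the breaker is open; return `fallback` on skip or failure."""
    if not breaker.allow():
        return fallback
    try:
        result = request()
    except retryable:
        breaker.record_failure()
        return fallback
    breaker.record_success()
    return result
```

In practice request would be the retrying streaming call, so the breaker only counts failures that survived all retries, and fallback could be a cached answer or a short "please try again" message.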

Related errors

Error	Cause	Quick fix
RateLimitError	Too many requests in short time	Add exponential backoff retry logic around API calls
ConnectionResetError	Network connection dropped	Retry streaming call with backoff
APIConnectionError	Failed to connect to API endpoint	Check network and retry with backoff

Key Takeaways

Wrap streaming calls in retry loops with exponential backoff to handle timeouts.
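Catch openai.APITimeoutError (and httpx.TimeoutException for mid-stream failures) explicitly, rather than the built-in TimeoutError, so the stream can be restarted safely.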
Monitor and log streaming errors to improve network reliability.
Use circuit breakers or fallbacks to maintain UX during persistent failures.

Verified 2026-04 · gpt-4o-mini
