How to handle streaming errors
Quick answer
Handle streaming errors in OpenAI API calls by wrapping the streaming loop in a try-except block that catches exceptions such as APIError or APIConnectionError. Add retries with exponential backoff to recover from transient network or rate-limit issues during streaming.
Error type: api_error
Quick fix: Add exponential backoff retry logic around your API call to handle RateLimitError automatically.
Why this happens
Streaming errors stem from network interruptions, API rate limits, or server-side failures during chat completion calls made with stream=True. A RateLimitError, for example, can be raised when the request is made, and a dropped connection can make the stream iterator raise partway through the response, cutting the output off abruptly.
Typical broken code looks like this, where no error handling is applied:
from openai import OpenAI
import os

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
Output:
Hello, how can I assist you today?
The fix
Wrap the streaming call in a try-except block and implement exponential backoff retries to handle transient errors gracefully. This approach retries the streaming request after increasing delays, preventing immediate failure on recoverable errors.
This snippet makes up to 3 attempts in total, doubling the delay after each failed attempt:
import os
import time

from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

max_retries = 3  # total attempts, including the first
retry_delay = 1  # seconds

for attempt in range(max_retries):
    try:
        stream = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "Hello"}],
            stream=True
        )
        for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="", flush=True)
        break  # success, exit retry loop
    except Exception as e:
        # A retry restarts the response, so partial output may be printed again.
        print(f"Streaming error: {e}")
        if attempt < max_retries - 1:
            time.sleep(retry_delay)
            retry_delay *= 2  # exponential backoff
        else:
            print("Max retries reached. Streaming failed.")
Output:
Hello, how can I assist you today?
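The retry loop above prints chunks as they arrive, so a failure partway through a response followed by a retry can show the beginning of the answer twice. One possible way around this, sketched below, is to buffer each attempt and return text only from the attempt that completes. This gives up incremental display, so it only suits callers that can wait for the full response. The names stream_with_retry, open_stream, and flaky_stream are hypothetical: open_stream stands in for any function that starts a streaming request and yields text pieces, and flaky_stream simulates a dropped connection so the example runs without network access.

```python
import time
from typing import Callable, Iterable


def stream_with_retry(
    open_stream: Callable[[], Iterable[str]],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the full text from the first attempt that completes."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        parts = []  # fresh buffer, so a failed attempt leaves nothing behind
        try:
            for piece in open_stream():
                parts.append(piece)
            return "".join(parts)
        except Exception as exc:  # narrow this to the errors you treat as retryable
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            print(f"Attempt {attempt} failed after {len(parts)} chunks: {exc}")
            sleep(delay)
            delay *= 2  # exponential backoff
    raise ValueError("max_attempts must be at least 1")


# Simulated stream (hypothetical): fails mid-response on the first call only.
calls = {"count": 0}


def flaky_stream() -> Iterable[str]:
    calls["count"] += 1
    yield "Hello, "
    if calls["count"] == 1:
        raise ConnectionError("simulated dropped connection")
    yield "world!"


# The sleep function is injected so the demo does not actually wait.
result = stream_with_retry(flaky_stream, sleep=lambda _seconds: None)
print(result)
```

Injecting sleep as a parameter is a design choice for testability; in real use the default time.sleep applies.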
Preventing it in production
- Use robust retry logic with exponential backoff and jitter to avoid hammering the API during outages or rate limits.
- Validate API keys and network connectivity before streaming calls.
- Implement fallback mechanisms such as switching to non-streaming completions or cached responses if streaming repeatedly fails.
- Monitor error rates and alert on spikes to proactively address API or network issues.
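The backoff-with-jitter and fallback points above can be sketched together. This is a minimal illustration under stated assumptions, not a definitive implementation: backoff_delay and complete_with_fallback are hypothetical helpers, the delay strategy is the variant often called "full jitter" (a uniform random delay up to an exponentially growing cap), and streaming_call and fallback_call stand in for your own streaming and non-streaming (or cached) request functions.

```python
import random
import time
from typing import Callable, Optional


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 30.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt)]."""
    generator = rng if rng is not None else random.Random()
    return generator.uniform(0, min(cap, base * (2 ** attempt)))


def complete_with_fallback(
    streaming_call: Callable[[], str],
    fallback_call: Callable[[], str],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try the streaming path with jittered backoff, then fall back once."""
    for attempt in range(max_attempts):
        try:
            return streaming_call()
        except Exception as exc:  # narrow to retryable errors in real code
            print(f"Streaming attempt {attempt + 1} failed: {exc}")
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt))
    # Streaming kept failing: use the non-streaming (or cached) path instead.
    return fallback_call()


def always_fails() -> str:
    """Hypothetical streaming call that simulates a sustained outage."""
    raise TimeoutError("simulated streaming outage")


answer = complete_with_fallback(
    always_fails,
    lambda: "cached response",
    sleep=lambda _seconds: None,  # skip real waiting in this demo
)
print(answer)
```

The random component spreads retries from many clients over time, so they do not all hit the API again at the same instant after an outage.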
Key Takeaways
- Always wrap streaming calls in try-except blocks to catch runtime errors.
- Implement exponential backoff retries to handle transient streaming failures.
- Monitor and alert on streaming error rates to maintain production reliability.
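For the monitoring point, one simple approach is a sliding window over recent call outcomes. The sketch below assumes an in-process counter is enough; ErrorRateMonitor, its window size, and its threshold are hypothetical choices, and a production setup would more likely export these counts to an existing metrics and alerting system.

```python
from collections import deque


class ErrorRateMonitor:
    """Track the failure rate over the most recent `window` streaming calls."""

    def __init__(self, window: int = 100, threshold: float = 0.2) -> None:
        self.results = deque(maxlen=window)  # True = failed, False = succeeded
        self.threshold = threshold

    def record(self, failed: bool) -> None:
        """Store one call outcome; the oldest outcome drops out when full."""
        self.results.append(failed)

    def error_rate(self) -> float:
        """Fraction of recorded calls that failed (0.0 when nothing is recorded)."""
        if not self.results:
            return 0.0
        return sum(self.results) / len(self.results)

    def should_alert(self) -> bool:
        """True when the recent failure rate exceeds the threshold."""
        return self.error_rate() > self.threshold


monitor = ErrorRateMonitor(window=10, threshold=0.2)
for failed in [False, False, True, False, True, True, False, False, False, False]:
    monitor.record(failed)

print(f"error rate: {monitor.error_rate():.0%}, alert: {monitor.should_alert()}")
```

Calling record(True) in the except branch of the retry loop and record(False) after a successful stream would feed the monitor.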