High severity intermediate · Fix: 5-10 min

TimeoutError

asyncio.exceptions.TimeoutError

What this error means

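An asyncio.wait_for call around the LLM stream raises TimeoutError when the next chunk does not arrive within the configured timeout. The exception surfaces while FastAPI's StreamingResponse is iterating the generator; StreamingResponse itself does not impose a timeout.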

Stack trace

traceback

Traceback (most recent call last):
  File "/app/main.py", line 45, in stream_llm_response
    async for chunk in llm_stream:
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 481, in wait_for
    raise asyncio.exceptions.TimeoutError
asyncio.exceptions.TimeoutError

QUICK FIX

Apply asyncio.wait_for to each chunk read (not to the generator object itself) and raise the timeout above the longest gap you expect between chunks.

Why it happens

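StreamingResponse simply iterates the generator it is given and sends each chunk as it arrives. The timeout comes from asyncio.wait_for, as the stack trace shows, which is called in application or client code around the stream. If the LLM or the network delivers no data within that window, wait_for cancels the pending read and raises TimeoutError. This typically happens when the model is slow to produce its first token or the stream stalls midway.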
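The mechanism can be reproduced without FastAPI. The following is a minimal sketch, assuming a hypothetical `slow_llm_stream` generator that stands in for the LLM client; the delays are shortened so it runs quickly.

```python
import asyncio

async def slow_llm_stream():
    # Hypothetical stand-in for an LLM client: the first chunk arrives after 0.2 s
    await asyncio.sleep(0.2)
    yield b"data: hello\n\n"

async def read_first_chunk(timeout):
    chunks = slow_llm_stream()
    try:
        # wait_for cancels the pending read and raises TimeoutError when the limit expires
        return await asyncio.wait_for(chunks.__anext__(), timeout=timeout)
    except asyncio.TimeoutError:
        return None
    finally:
        await chunks.aclose()

too_short = asyncio.run(read_first_chunk(timeout=0.05))   # limit shorter than the delay
long_enough = asyncio.run(read_first_chunk(timeout=1.0))  # limit longer than the delay
```

With the shorter limit the read is cancelled before the chunk is produced; with the longer limit the same generator completes normally.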

Detection

Monitor application logs for asyncio TimeoutError exceptions raised in streaming endpoints, and track time-to-first-chunk and inter-chunk latency so that slow or stalled LLM streams are caught before clients time out.
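One way to collect the latency signal is a pass-through wrapper around the stream. This is a sketch only: `log_slow_chunks`, the logger name, and the `warn_after` threshold are illustrative names, not part of FastAPI or any LLM SDK.

```python
import asyncio
import logging
import time

logger = logging.getLogger("llm_stream")

async def log_slow_chunks(chunks, warn_after=5.0):
    """Pass chunks through unchanged; warn when the wait for a chunk exceeds warn_after seconds."""
    last = time.monotonic()
    async for chunk in chunks:
        gap = time.monotonic() - last
        if gap > warn_after:
            logger.warning("slow LLM chunk: %.2fs since previous chunk", gap)
        yield chunk
        # Restart the clock after the consumer resumes us, so only upstream wait time is measured
        last = time.monotonic()
```

It could be used as `StreamingResponse(log_slow_chunks(llm_stream()), ...)`, with the threshold set somewhat below the timeout so warnings appear before errors do.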

Causes & fixes

LLM streaming generator stalls or delays chunks beyond the timeout passed to asyncio.wait_for

✓ Fix

Raise the timeout passed to asyncio.wait_for for each chunk read, and make sure any server or reverse-proxy timeouts in front of the app are at least as long. StreamingResponse has no timeout setting of its own.

Network latency or slow LLM model response causes delayed streaming chunks

✓ Fix

Use a faster or smaller LLM model or optimize network connectivity to reduce streaming delays.

Improper async generator implementation causing blocking or deadlocks

✓ Fix

Ensure the LLM streaming generator yields chunks promptly and does not block the event loop.
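If the LLM client is synchronous, one option is to run the blocking call in a thread so the event loop keeps serving other tasks. A minimal sketch, assuming a hypothetical `blocking_llm_call` function in place of a real SDK call:

```python
import asyncio
import time

def blocking_llm_call(prompt):
    # Hypothetical synchronous SDK call; time.sleep stands in for blocking network I/O
    time.sleep(0.1)
    return f"echo: {prompt}".encode()

async def llm_stream(prompt):
    loop = asyncio.get_running_loop()
    # Run the blocking call in the default thread pool so the event loop stays responsive
    chunk = await loop.run_in_executor(None, blocking_llm_call, prompt)
    yield chunk

async def demo():
    ticks = 0

    async def ticker():
        # Counts how often the event loop gets to run while the LLM call is in progress
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(ticker())
    chunks = [c async for c in llm_stream("hi")]
    task.cancel()
    return chunks, ticks

chunks, ticks = asyncio.run(demo())
```

The `ticker` task keeps advancing while the call runs, which it would not do if `time.sleep` were called directly inside the generator.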

Code: broken vs fixed

Broken - triggers the error

python

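import asyncio
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def llm_stream():
    # Simulate a slow LLM: the first chunk arrives after 10 seconds
    await asyncio.sleep(10)
    yield b"data: chunk\n\n"

async def stream_llm_response():
    chunks = llm_stream()
    while True:
        try:
            # The 5-second limit is shorter than the 10-second delay: this raises TimeoutError
            chunk = await asyncio.wait_for(chunks.__anext__(), timeout=5)
        except StopAsyncIteration:
            break
        yield chunk

@app.get("/stream")
async def stream():
    # The uncaught TimeoutError aborts the response mid-stream
    return StreamingResponse(stream_llm_response(), media_type="text/event-stream")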

Fixed - works correctly

python

import asyncio
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

CHUNK_TIMEOUT = 30  # seconds allowed between chunks

async def llm_stream():
    # Simulate slow streaming response
    await asyncio.sleep(10)  # still slow, but within the 30-second limit
    yield b"data: chunk\n\n"

async def wrapped_stream():
    chunks = llm_stream()
    try:
        while True:
            try:
                # wait_for takes an awaitable, so time each chunk read, not the generator
                chunk = await asyncio.wait_for(chunks.__anext__(), timeout=CHUNK_TIMEOUT)
            except StopAsyncIteration:
                break
            yield chunk
    except asyncio.TimeoutError:
        # Tell the client what happened instead of dropping the connection
        yield b"event: error\ndata: Timeout occurred\n\n"

@app.get("/stream")
async def stream():
    # Changed to wrapped_stream with increased per-chunk timeout
    return StreamingResponse(wrapped_stream(), media_type="text/event-stream")

Applied asyncio.wait_for to each chunk read instead of the generator object (wait_for accepts an awaitable, not an async generator) and raised the timeout to 30 seconds, so a 10-second gap no longer triggers TimeoutError. If the limit is still exceeded, the client receives an SSE error event.

Workaround

Catch asyncio.TimeoutError inside the streaming generator and either send a fallback error event to the client or retry the LLM request, so the client receives a clean response instead of an abruptly closed connection.

Prevention

Design LLM streaming generators to yield data frequently, set application, server, and proxy timeouts to accommodate expected LLM response times, or send heartbeat messages to keep idle streams alive.
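A heartbeat can be sketched as a wrapper that emits an SSE comment line whenever the upstream is quiet. `with_heartbeat` and its `interval` parameter are illustrative names, not a FastAPI feature. The sketch uses asyncio.wait rather than asyncio.wait_for, because wait_for would cancel the pending read on timeout and end the upstream generator.

```python
import asyncio

async def with_heartbeat(chunks, interval=15.0):
    """Yield chunks from `chunks`, emitting an SSE comment when none arrives for `interval` seconds."""
    iterator = chunks.__aiter__()
    pending = asyncio.ensure_future(iterator.__anext__())
    try:
        while True:
            # asyncio.wait leaves the read running on timeout, so no chunk is lost
            done, _ = await asyncio.wait({pending}, timeout=interval)
            if not done:
                yield b": keep-alive\n\n"  # SSE comment line, ignored by event-stream clients
                continue
            try:
                chunk = pending.result()
            except StopAsyncIteration:
                break
            yield chunk
            pending = asyncio.ensure_future(iterator.__anext__())
    finally:
        # Stop the outstanding read if the client disconnects or the stream ends
        pending.cancel()
```

It could be used as `StreamingResponse(with_heartbeat(llm_stream()), media_type="text/event-stream")`, with `interval` chosen below the shortest idle timeout of any proxy or client in the path.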

Python 3.8+ · fastapi >=0.70.0 · tested on 0.95.0

Verified 2026-04
