High severity · Intermediate · Fix: 5-10 min

RuntimeError

builtins.RuntimeError

What this error means

Synchronous (blocking) LLM calls inside FastAPI async endpoints stall the event loop, so concurrent requests queue behind one another and API responses become slow or stall entirely.

Stack trace

traceback

RuntimeError: This event loop is already running
  File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 237, in app
    result = await dependant.call(**values)
  File "/app/main.py", line 42, in generate_response
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)  # blocking call
  File "/usr/local/lib/python3.10/site-packages/openai/api_resources/chat_completion.py", line 45, in create
    return self._client.request("post", self._get_url(), json=params)
  File "/usr/local/lib/python3.10/site-packages/openai/api_requestor.py", line 123, in request
    resp = self._session.request(method, url, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1503, in request
    raise RuntimeError("Cannot call async client methods from sync code")
RuntimeError: This event loop is already running

QUICK FIX

Replace synchronous LLM calls with an async client (for example, AsyncOpenAI) and await them inside FastAPI async endpoints.

Why it happens

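FastAPI runs async def endpoints directly on the event loop (plain def endpoints are run in a thread pool instead). A synchronous LLM client call inside an async def route therefore holds the loop for the entire network round trip, so no other request can make progress until it returns. This shows up as delays or apparent deadlocks under concurrent load. The RuntimeError itself is typically raised when code already running inside the loop tries to start it again, for example by wrapping an async call in loop.run_until_complete(). LLM SDK calls must either be awaited through an async client or moved to a separate thread.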
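The effect can be reproduced without FastAPI or an LLM SDK. The sketch below is an illustration only: blocking_llm_call and async_llm_call are hypothetical stand-ins that sleep for 0.2 s instead of making a network request, and the handler names are invented for this example.

```python
import asyncio
import time


def blocking_llm_call() -> str:
    # Hypothetical stand-in for a synchronous SDK call.
    time.sleep(0.2)
    return "done"


async def async_llm_call() -> str:
    # Hypothetical stand-in for an awaited async SDK call.
    await asyncio.sleep(0.2)
    return "done"


async def handler_blocking() -> str:
    # Holds the event loop for the full 0.2 s; nothing else can run.
    return blocking_llm_call()


async def handler_async() -> str:
    # Yields to the event loop while waiting.
    return await async_llm_call()


async def timed(handler, n: int = 3) -> float:
    # Run n handlers "concurrently" and return the wall-clock time taken.
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(n)))
    return time.perf_counter() - start


async def compare() -> tuple[float, float]:
    return await timed(handler_blocking), await timed(handler_async)


if __name__ == "__main__":
    blocking_s, async_s = asyncio.run(compare())
    print(f"blocking handlers: {blocking_s:.2f}s, async handlers: {async_s:.2f}s")
```

With three handlers, the blocking version should take roughly three times as long as the async one, because each time.sleep runs to completion before the next handler starts.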

Detection

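Monitor FastAPI logs for a RuntimeError stating that the event loop is already running, and watch for response times that grow with the number of concurrent requests. To locate the blocking call, enable asyncio debug mode (for example, PYTHONASYNCIODEBUG=1), which logs any callback that holds the loop for longer than 100 ms by default.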
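One way to observe blocking directly is to measure how late a short asyncio.sleep wakes up. The following is a minimal sketch, not a production monitor: measure_loop_lag and demo are hypothetical names, and time.sleep(0.3) simulates a blocking LLM call.

```python
import asyncio
import time


async def measure_loop_lag(interval: float = 0.05, samples: int = 5) -> float:
    """Return the worst observed delay (seconds) beyond the requested sleep."""
    worst = 0.0
    for _ in range(samples):
        start = time.perf_counter()
        await asyncio.sleep(interval)
        # Any time beyond `interval` means the loop was busy elsewhere.
        lag = time.perf_counter() - start - interval
        worst = max(worst, lag)
    return worst


async def demo() -> float:
    monitor = asyncio.create_task(measure_loop_lag())
    await asyncio.sleep(0.01)  # let the monitor begin its first sleep
    time.sleep(0.3)            # simulated blocking call inside a coroutine
    return await monitor


if __name__ == "__main__":
    print(f"worst event loop lag: {asyncio.run(demo()):.3f}s")
```

A healthy loop reports lag of a few milliseconds at most; a value close to the duration of an LLM request suggests a synchronous call is running on the loop.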

Causes & fixes

Calling synchronous LLM client methods inside async FastAPI endpoints blocks the event loop.

✓ Fix

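Use the async client and await its methods. In openai>=1.0 this means creating an AsyncOpenAI client and calling await client.chat.completions.create(...). The acreate method exists only in pre-1.0 versions of the SDK (openai.ChatCompletion.acreate).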

Using blocking I/O calls (like requests or sync OpenAI calls) directly in async routes.

✓ Fix

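Run blocking calls in a thread pool using run_in_threadpool (importable from fastapi.concurrency) or asyncio.to_thread. Alternatively, declare the route with plain def so that FastAPI runs the whole handler in its thread pool.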

Calling async LLM methods without await, so the route receives a coroutine object that never ran instead of a response.

✓ Fix

Ensure all LLM calls are awaited properly with async/await syntax.
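The missing-await case is easy to overlook because the call appears to succeed. The sketch below illustrates it with a hypothetical async_llm_call stand-in rather than a real SDK method.

```python
import asyncio
import inspect


async def async_llm_call(prompt: str) -> str:
    # Hypothetical stand-in for an async SDK method.
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"


async def handler_missing_await() -> bool:
    result = async_llm_call("Hello")  # no await: the body has not run yet
    is_coro = inspect.iscoroutine(result)
    result.close()  # avoid the "coroutine was never awaited" warning
    return is_coro


async def handler_with_await() -> str:
    # Awaiting runs the coroutine and returns its actual result.
    return await async_llm_call("Hello")


if __name__ == "__main__":
    print(asyncio.run(handler_missing_await()))
    print(asyncio.run(handler_with_await()))
```

Without the explicit close(), Python would emit a RuntimeWarning that the coroutine was never awaited, which is a useful signal to search for in logs.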

Code: broken vs fixed

Broken - triggers the error

python

from fastapi import FastAPI
from openai import OpenAI

app = FastAPI()
client = OpenAI()

@app.get("/generate")
async def generate_response():
    messages = [{"role": "user", "content": "Hello"}]
    # This synchronous call blocks the event loop until the HTTP request finishes
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return response

Fixed - works correctly

python

import os
from fastapi import FastAPI
from openai import AsyncOpenAI

app = FastAPI()
# AsyncOpenAI exposes the same methods as OpenAI, but they must be awaited
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

@app.get("/generate")
async def generate_response():
    messages = [{"role": "user", "content": "Hello"}]
    # Awaiting the async client yields to the event loop during the request
    response = await client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return response

# Switched to AsyncOpenAI and awaited the call to fix the blocking issue

Replaced the synchronous OpenAI client with AsyncOpenAI and awaited the call, so the request no longer blocks FastAPI's event loop and other requests can run concurrently.

⚠

Workaround

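If no async client is available, wrap the blocking LLM call in asyncio.to_thread or FastAPI's run_in_threadpool so that it runs in a worker thread and the event loop stays free. Treat this as a stopgap: each in-flight call occupies a thread for its full duration.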
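A minimal sketch of the thread offload, assuming Python 3.9+ for asyncio.to_thread. Here blocking_llm_call is a hypothetical stand-in that sleeps instead of calling an SDK, and handler plays the role of an async route body.

```python
import asyncio
import time


def blocking_llm_call(prompt: str) -> str:
    # Hypothetical stand-in for a synchronous SDK call.
    time.sleep(0.2)
    return f"echo: {prompt}"


async def handler(prompt: str) -> str:
    # Runs the blocking function in the default thread pool,
    # leaving the event loop free to serve other requests.
    return await asyncio.to_thread(blocking_llm_call, prompt)


async def run_concurrently(n: int = 3) -> tuple[list[str], float]:
    start = time.perf_counter()
    results = await asyncio.gather(*(handler(f"p{i}") for i in range(n)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(run_concurrently())
    print(results, f"{elapsed:.2f}s")
```

The three calls overlap in separate threads, so the total time should stay close to that of a single call rather than the sum of all three.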

✓

Prevention

Always use async LLM client methods in FastAPI async endpoints or isolate blocking calls in thread pools to maintain concurrency and responsiveness.

Python 3.9+ · fastapi >=0.70.0 · tested on 0.95.x

Verified 2026-04 · gpt-4o-mini, claude-3-5-haiku-20241022
