RuntimeError
asyncio.exceptions.RuntimeError
Stack trace
RuntimeError: This event loop is already running
  File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 237, in app
    result = await dependant.call(**values)
  File "/app/main.py", line 42, in generate_response
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages) # blocking call
  File "/usr/local/lib/python3.10/site-packages/openai/api_resources/chat_completion.py", line 45, in create
    return self._client.request("post", self._get_url(), json=params)
  File "/usr/local/lib/python3.10/site-packages/openai/api_requestor.py", line 123, in request
    resp = self._session.request(method, url, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1503, in request
    raise RuntimeError("Cannot call async client methods from sync code")
RuntimeError: This event loop is already running
Why it happens
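FastAPI runs endpoints declared with async def directly on the event loop, so a synchronous or blocking LLM client call inside such a route stalls the loop until the call returns. No other request can make progress in the meantime, which shows up as latency spikes or apparent deadlocks under concurrent load. The "This event loop is already running" message itself is raised by asyncio when synchronous code tries to start the loop (for example with loop.run_until_complete) from inside a coroutine that the loop is already executing. To avoid both problems, either await an async client method or run the blocking call in a separate thread.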
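The error message can be reproduced without FastAPI or any LLM SDK. The sketch below is a minimal illustration using only the standard library; fetch_reply and handler are hypothetical names standing in for an SDK call and a route handler, and the example assumes the default asyncio event loop.

```python
import asyncio


async def fetch_reply() -> str:
    # Hypothetical stand-in for an async LLM request.
    await asyncio.sleep(0)
    return "ok"


async def handler() -> str:
    # Hypothetical stand-in for an async route handler.
    loop = asyncio.get_running_loop()
    coro = fetch_reply()
    try:
        # Sync-style code trying to drive the loop from inside the loop.
        loop.run_until_complete(coro)
    except RuntimeError as exc:
        coro.close()  # the coroutine never ran; close it to avoid a warning
        return str(exc)
    return "no error"


message = asyncio.run(handler())
print(message)
```

Inside a coroutine the loop is already running, so the only supported way to get the result is to await fetch_reply() directly.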
Detection
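Watch the FastAPI logs for a RuntimeError reporting that the event loop is already running, and watch for response times that climb as concurrency increases. asyncio's debug mode (for example, PYTHONASYNCIODEBUG=1) logs any callback that holds the loop longer than loop.slow_callback_duration, which defaults to 100 ms, and so helps locate the blocking call.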
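One way to make blocking visible is to measure how late a periodic task wakes up. The following is a rough sketch, not a production monitor: measure_loop_lag, blocking_handler, well_behaved_handler, and run_with_monitor are hypothetical names, and time.sleep stands in for a synchronous SDK call.

```python
import asyncio
import time


async def measure_loop_lag(interval: float, samples: int) -> float:
    """Return the worst delay (seconds) beyond `interval` across wake-ups."""
    worst = 0.0
    for _ in range(samples):
        start = time.perf_counter()
        await asyncio.sleep(interval)
        lag = time.perf_counter() - start - interval
        worst = max(worst, lag)
    return worst


async def blocking_handler() -> None:
    # Stand-in for a synchronous SDK call made inside an async route.
    time.sleep(0.3)


async def well_behaved_handler() -> None:
    # Stand-in for an awaited async SDK call.
    await asyncio.sleep(0.3)


async def run_with_monitor(handler) -> float:
    monitor = asyncio.create_task(measure_loop_lag(interval=0.05, samples=8))
    await asyncio.sleep(0)  # let the monitor start its first sleep
    await handler()
    return await monitor


blocked_lag = asyncio.run(run_with_monitor(blocking_handler))
clean_lag = asyncio.run(run_with_monitor(well_behaved_handler))
print(f"worst lag with blocking call: {blocked_lag:.3f}s")
print(f"worst lag with awaited call:  {clean_lag:.3f}s")
```

A lag close to the duration of the suspect call suggests that the call is holding the event loop; a lag near zero suggests it is yielding as intended.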
Causes & fixes
Cause: Calling synchronous LLM client methods inside async FastAPI endpoints blocks the event loop.
Fix: Use the async client, for example AsyncOpenAI with await client.chat.completions.create(...). The 1.x OpenAI SDK has no acreate method; that name belongs to the pre-1.0 SDK.
Cause: Using blocking I/O calls (such as requests or sync OpenAI calls) directly in async routes.
Fix: Run blocking calls in a thread pool using run_in_threadpool (from fastapi.concurrency) or asyncio.to_thread.
Cause: Not awaiting async LLM calls, which leaves an unexecuted coroutine object where the response was expected.
Fix: Ensure every async LLM call is awaited.
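The missing-await case is easy to overlook because it does not fail at the call site. A small sketch of the difference, using a hypothetical fake_llm_call coroutine in place of a real SDK method:

```python
import asyncio
import inspect


async def fake_llm_call(prompt: str) -> str:
    # Hypothetical stand-in for an async LLM client method.
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"


async def forgot_await() -> bool:
    result = fake_llm_call("Hello")  # missing await: no request is made
    is_coroutine = inspect.iscoroutine(result)
    result.close()  # discard the unstarted coroutine cleanly
    return is_coroutine


async def with_await() -> str:
    return await fake_llm_call("Hello")


missing = asyncio.run(forgot_await())
reply = asyncio.run(with_await())
print(missing, reply)
```

Without await, the handler holds a coroutine object rather than a response, and any later attempt to serialize or index it fails far from the actual mistake.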
Code: broken vs fixed
# Broken: synchronous client called from an async route
from fastapi import FastAPI
from openai import OpenAI

app = FastAPI()
client = OpenAI()

@app.get("/generate")
async def generate_response():
    messages = [{"role": "user", "content": "Hello"}]
    # This synchronous call blocks the event loop until the API responds
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return response

# Fixed: async client, awaited
import os
from fastapi import FastAPI
from openai import AsyncOpenAI

app = FastAPI()
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

@app.get("/generate")
async def generate_response():
    messages = [{"role": "user", "content": "Hello"}]
    # AsyncOpenAI exposes create() as a coroutine; awaiting it frees the event loop
    response = await client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return response
# Switched to the async client and awaited the call to fix the blocking issue
Workaround
If an async client is not available, wrap the blocking LLM call in asyncio.to_thread or FastAPI's run_in_threadpool so that it runs in a worker thread and the event loop stays free to serve other requests.
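A minimal sketch of the thread-offloading workaround, assuming Python 3.9 or later (where asyncio.to_thread is available). blocking_llm_call and generate are hypothetical names, and time.sleep stands in for the synchronous SDK request.

```python
import asyncio
import time


def blocking_llm_call(prompt: str) -> str:
    # Hypothetical stand-in for a synchronous SDK request.
    time.sleep(0.2)
    return f"echo: {prompt}"


async def generate(prompt: str) -> str:
    # Run the blocking function in a worker thread; the loop keeps running.
    return await asyncio.to_thread(blocking_llm_call, prompt)


async def main() -> tuple[list[str], float]:
    start = time.perf_counter()
    # Three requests overlap instead of running back to back.
    replies = await asyncio.gather(*(generate(f"q{i}") for i in range(3)))
    return list(replies), time.perf_counter() - start


replies, elapsed = asyncio.run(main())
print(replies, f"{elapsed:.2f}s")
```

Because each call occupies a thread from the default executor, this approach suits moderate concurrency; an async client avoids the per-call thread entirely.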
Prevention
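In async FastAPI endpoints, always use async LLM client methods, or isolate blocking calls in a thread pool, so that the application stays concurrent and responsive. A route that must use a synchronous client can also be declared with plain def, in which case FastAPI runs it in a thread pool rather than on the event loop.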