How-to · Beginner · 3 min read

How to use async with OpenAI SDK in FastAPI

Quick answer
Use the OpenAI SDK's async client in FastAPI by instantiating AsyncOpenAI, defining async route handlers, and calling await client.chat.completions.create(). This keeps AI requests non-blocking, so your FastAPI app can serve other requests while waiting on the API.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0" (quoted, so the shell doesn't treat >= as a redirection)
  • pip install fastapi uvicorn

Setup

Install the required packages and set your OpenAI API key as an environment variable.

  • Install FastAPI and Uvicorn for the web server.
  • Install the OpenAI SDK version 1 or higher.
  • Set OPENAI_API_KEY in your environment.
bash
pip install fastapi uvicorn "openai>=1.0"
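To make the key available to the server process, export it in the shell that launches uvicorn. A minimal sketch (the key value is a placeholder):

shell
# Set the key for the current shell session (replace with your real key)
export OPENAI_API_KEY="sk-your-key-here"

# Confirm it is visible to child processes such as uvicorn
echo "${OPENAI_API_KEY:+key is set}"

For anything beyond local experiments, prefer a secrets manager or a .env file excluded from version control.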

Step by step

Create an async FastAPI endpoint that uses the SDK's AsyncOpenAI client and awaits client.chat.completions.create() to generate chat completions. The plain OpenAI client is synchronous; only AsyncOpenAI returns awaitables.

python
import os
from fastapi import FastAPI
from openai import AsyncOpenAI

app = FastAPI()
# AsyncOpenAI exposes awaitable methods; the plain OpenAI client does not
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

@app.get("/chat")
async def chat_with_openai(prompt: str):
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    )
    return {"response": response.choices[0].message.content}

# To run:
# uvicorn filename:app --reload
output
{"response": "Hello! How can I assist you today?"}
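The payoff of async handlers is concurrency: while one request awaits the OpenAI API, the event loop serves others. A self-contained sketch of that effect, with a stand-in coroutine (fake_completion is hypothetical, simulating API latency with asyncio.sleep):

python
import asyncio
import time

# Stand-in for an awaited OpenAI call; simulates network latency (hypothetical)
async def fake_completion(i: int) -> str:
    await asyncio.sleep(0.2)
    return f"reply {i}"

async def main() -> None:
    start = time.perf_counter()
    # gather runs all five "requests" concurrently on the event loop
    replies = await asyncio.gather(*(fake_completion(i) for i in range(5)))
    elapsed = time.perf_counter() - start
    print(len(replies))   # 5
    print(elapsed < 0.9)  # True: concurrent, not 5 x 0.2 s sequential

asyncio.run(main())

FastAPI applies the same principle per request: each awaiting handler yields the loop to other handlers.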

Common variations

The same AsyncOpenAI client covers other endpoints — for example, await client.embeddings.create() for embeddings. Switch models by changing the model parameter. For streaming responses, pass stream=True and iterate over the chunks with async for.

python
import os
from fastapi import FastAPI
from openai import AsyncOpenAI

app = FastAPI()
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

@app.get("/stream-chat")
async def stream_chat(prompt: str):
    # stream=True makes create() return an async iterator of chunks
    response_stream = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        stream=True
    )
    collected = []
    async for chunk in response_stream:
        # delta.content is None on some chunks (e.g. the final one)
        if chunk.choices[0].delta.content:
            collected.append(chunk.choices[0].delta.content)
    return {"streamed_response": "".join(collected)}
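The async for pattern above works on any async iterator. A self-contained sketch of the same collection loop, with a hypothetical chunk_stream generator standing in for the SDK's stream:

python
import asyncio

# Stand-in for chunks arriving from an async OpenAI stream (hypothetical)
async def chunk_stream():
    for part in ["Hel", "lo", "!"]:
        await asyncio.sleep(0)  # yield control, as a real network wait would
        yield part

async def collect() -> str:
    collected = []
    async for part in chunk_stream():
        collected.append(part)
    return "".join(collected)

print(asyncio.run(collect()))  # Hello!

Note that collecting every chunk before returning, as in the endpoint above, forfeits the latency benefit of streaming; to forward chunks to the browser as they arrive, wrap an async generator in FastAPI's StreamingResponse instead.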

Troubleshooting

  • If you get RuntimeWarning: coroutine ... was never awaited, ensure your route handler is declared async def and that you await every client call. Conversely, if await raises a TypeError, check that you constructed AsyncOpenAI rather than the synchronous OpenAI client.
  • If you see authentication errors, verify your OPENAI_API_KEY environment variable is set correctly.
  • For connection timeouts, check your network and consider increasing timeout settings if supported.

Key Takeaways

  • Use async def in FastAPI routes to enable async OpenAI SDK calls.
  • Call await client.chat.completions.create() on an AsyncOpenAI client for non-blocking completions.
  • Set your API key securely via environment variables to avoid leaks.
  • Streaming responses require async iteration over the response stream.
  • Handle common errors by verifying async usage and environment configuration.
Verified 2026-04 · gpt-4o-mini