How to use OpenAI API with FastAPI
Quick answer
Use the openai SDK v1 with FastAPI by importing OpenAI from openai, initializing the client with your API key from os.environ, and creating a FastAPI route that calls client.chat.completions.create with your desired model, such as gpt-4o. This enables building AI-powered HTTP endpoints easily.

Prerequisites

- Python 3.8+
- OpenAI API key (free tier works)
- pip install openai>=1.0 fastapi uvicorn
Setup
Install the required packages and set your OpenAI API key as an environment variable.
- Install packages: pip install openai fastapi uvicorn
- Set environment variable: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or set OPENAI_API_KEY=your_api_key (Windows)
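Before starting the server, you can sanity-check that the key is actually visible to Python. A minimal stdlib helper (the function name is illustrative, not part of any SDK) that reports key status without printing the full key:

```python
import os

def api_key_status(env=None):
    """Report whether OPENAI_API_KEY is present, without leaking the full key."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY")
    if not key:
        return "missing: export OPENAI_API_KEY before starting the server"
    return f"loaded (ends in ...{key[-4:]})"

print("OPENAI_API_KEY " + api_key_status())
```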
Step by step
Create a FastAPI app that uses the OpenAI SDK v1 to generate chat completions with gpt-4o. The example defines a POST endpoint /chat which accepts a JSON payload with a user message and returns the AI response.
```python
import os

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from openai import OpenAI

app = FastAPI()
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])


class ChatRequest(BaseModel):
    message: str


class ChatResponse(BaseModel):
    reply: str


@app.post("/chat", response_model=ChatResponse)
async def chat_endpoint(request: ChatRequest):
    try:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": request.message}],
        )
        reply = response.choices[0].message.content
        return ChatResponse(reply=reply)
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

# To run: uvicorn filename:app --reload
```

Common variations
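Once the app is running via uvicorn, any HTTP client can exercise the endpoint. A stdlib sketch, assuming uvicorn's default host and port (127.0.0.1:8000); the helper names are illustrative:

```python
import json
import urllib.request

def build_chat_payload(message: str) -> bytes:
    # JSON body matching the ChatRequest model: {"message": "..."}
    return json.dumps({"message": message}).encode("utf-8")

def post_chat(url: str, message: str) -> str:
    req = urllib.request.Request(
        url,
        data=build_chat_payload(message),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["reply"]

# Example (requires the server to be running):
# print(post_chat("http://127.0.0.1:8000/chat", "Say hello in one word."))
```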
You can adapt the integration by:
- Using async calls if supported by your environment.
- Switching to other models like gpt-4o-mini for cost efficiency.
- Adding streaming responses with FastAPI's StreamingResponse for real-time output.
- Handling multiple messages for full chat history context.
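Passing full chat history simply means assembling the messages list from prior turns before each call. A minimal sketch (the helper name and parameters are hypothetical, not part of the SDK):

```python
def build_messages(history, user_message, system_prompt=None):
    """Assemble a messages list for chat.completions.create.

    history: prior turns as [{"role": ..., "content": ...}, ...]
    """
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.extend(history)  # earlier user/assistant turns, in order
    messages.append({"role": "user", "content": user_message})
    return messages
```

The endpoint would then pass messages=build_messages(...) instead of a single-item list; where the history lives (client-side, a database, in-memory) is up to your application.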
```python
from fastapi.responses import StreamingResponse


@app.post("/chat_stream")
async def chat_stream(request: ChatRequest):
    # SDK v1 streams tokens when stream=True is passed; each chunk
    # carries an incremental delta of the assistant's reply.
    def token_generator():
        stream = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": request.message}],
            stream=True,
        )
        for chunk in stream:
            delta = chunk.choices[0].delta.content
            if delta:
                yield delta

    return StreamingResponse(token_generator(), media_type="text/plain")
```

Troubleshooting
- If you get KeyError for the API key, ensure OPENAI_API_KEY is set in your environment.
- For HTTP 401 Unauthorized, verify your API key is valid and has permissions.
- Timeouts or network errors may require retry logic or checking your internet connection.
- Use uvicorn filename:app --reload to auto-reload on code changes during development.
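For the transient timeouts and network errors mentioned above, a generic exponential-backoff wrapper can help (a sketch, not an SDK feature; the helper name and defaults are arbitrary):

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))

# Usage inside the endpoint:
# reply = with_retries(lambda: client.chat.completions.create(...))
```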
Key Takeaways
- Use the OpenAI SDK v1 with FastAPI for clean, production-ready AI endpoints.
- Always load your API key securely from environment variables.
- Define Pydantic models for request and response validation in FastAPI.
- Handle exceptions gracefully to return meaningful HTTP errors.
- Explore streaming and async variants for advanced use cases.