How to build a chat endpoint with FastAPI
Quick answer
Use FastAPI to create a web server and define a POST endpoint that accepts chat messages. Integrate the OpenAI SDK v1, call client.chat.completions.create with the gpt-4o model, and return the AI's response as JSON.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install fastapi uvicorn "openai>=1.0"
Setup
Install FastAPI for the web framework, uvicorn as the ASGI server, and the openai SDK v1 for API calls. Set your OpenAI API key as an environment variable.
pip install fastapi uvicorn "openai>=1.0"
# Set your API key in your shell environment
export OPENAI_API_KEY="your-api-key-here"
Step by step
Create a FastAPI app with a POST /chat endpoint that accepts a JSON payload with a user message. Use the OpenAI client to send this message to the gpt-4o model and return the AI's reply.
import os

from fastapi import FastAPI, HTTPException
from openai import OpenAI
from pydantic import BaseModel

app = FastAPI()
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

class ChatRequest(BaseModel):
    message: str

class ChatResponse(BaseModel):
    reply: str

@app.post("/chat", response_model=ChatResponse)
async def chat_endpoint(request: ChatRequest):
    try:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": request.message}],
        )
        reply = response.choices[0].message.content
        return ChatResponse(reply=reply)
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

# To run (replace "filename" with your module name):
# uvicorn filename:app --reload
Common variations
- Use async calls with FastAPI for concurrency.
- Switch to other models like gpt-4o-mini for cost savings.
- Implement streaming responses with WebSockets or Server-Sent Events.
- Use Anthropic's claude-3-5-sonnet-20241022 model by swapping the SDK and client calls.
Troubleshooting
- If you get 401 Unauthorized, verify your OPENAI_API_KEY environment variable is set correctly.
- For 500 Internal Server Error, check network connectivity and API usage limits.
- Use uvicorn --reload during development to auto-reload on code changes.
Key Takeaways
- Use FastAPI's POST endpoint to receive chat messages and respond with AI completions.
- Leverage the OpenAI SDK v1 with gpt-4o for reliable chat completions.
- Always load API keys from environment variables for security.
- Consider async endpoints and streaming for production-grade chat apps.
- Handle exceptions gracefully to return meaningful HTTP errors.