How-to · Beginner · 3 min read

How to build a chatbot with FastAPI

Quick answer
Use FastAPI to create a web server that handles chat requests and integrates with the OpenAI SDK to generate responses from models like gpt-4o. Define an endpoint that accepts user messages, calls client.chat.completions.create(), and returns the AI's reply.

Prerequisites

  • Python 3.8+
  • OpenAI API key
  • pip install fastapi uvicorn "openai>=1.0"

Setup

Install required packages and set your OpenAI API key as an environment variable.

  • Install FastAPI, Uvicorn (ASGI server), and the OpenAI SDK. Quote the version specifier so the shell doesn't treat `>` as a redirect:

```bash
pip install fastapi uvicorn "openai>=1.0"
```
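
The code below reads the API key from the environment. A typical way to set it for the current shell session (the key value here is a placeholder, not a real key):

```shell
# Replace the placeholder with your actual key.
# This lasts only for the current shell session.
export OPENAI_API_KEY="your-api-key-here"
```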

Step by step

Create a FastAPI app with a POST endpoint /chat that accepts JSON messages, calls OpenAI's gpt-4o model, and returns the chatbot response.

```python
import os
from fastapi import FastAPI
from pydantic import BaseModel
from openai import OpenAI

app = FastAPI()
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

class ChatRequest(BaseModel):
    # OpenAI-style messages, e.g. [{"role": "user", "content": "Hi"}]
    messages: list

# A plain `def` endpoint: FastAPI runs it in a threadpool, so the
# blocking OpenAI SDK call doesn't stall the event loop.
@app.post("/chat")
def chat_endpoint(chat_request: ChatRequest):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=chat_request.messages
    )
    return {"reply": response.choices[0].message.content}

# To run: uvicorn filename:app --reload
```
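
With the server running (Uvicorn defaults to http://localhost:8000), the endpoint expects a JSON body containing an OpenAI-style messages list. A minimal sketch of building that payload for the /chat route defined above:

```python
import json

# OpenAI-style messages: a list of {"role", "content"} dicts.
# "system" sets the assistant's behavior; "user" carries the person's input.
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ]
}

# Serialize to the JSON body the endpoint expects.
body = json.dumps(payload)
```

The same payload can be sent from the command line, e.g. `curl -s -X POST http://localhost:8000/chat -H "Content-Type: application/json" -d '{"messages": [{"role": "user", "content": "Hello!"}]}'`.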

Common variations

You can use asynchronous calls, enable streaming responses for real-time tokens, or switch to cheaper models like gpt-4o-mini. For streaming from an async endpoint, use the SDK's AsyncOpenAI client, iterate over the response chunks, and yield each token as it arrives.

```python
from fastapi.responses import StreamingResponse
from openai import AsyncOpenAI

# An async client, so the stream can be consumed with `async for`
# without blocking the event loop.
async_client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

@app.post("/chat_stream")
async def chat_stream(chat_request: ChatRequest):
    stream = await async_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=chat_request.messages,
        stream=True
    )

    async def event_generator():
        async for chunk in stream:
            # Some chunks (e.g. the final one) carry no content.
            if chunk.choices and chunk.choices[0].delta.content:
                yield chunk.choices[0].delta.content

    return StreamingResponse(event_generator(), media_type="text/plain")
```

To watch tokens arrive incrementally, call this endpoint with `curl -N` (disables output buffering).

Troubleshooting

  • If you get authentication errors, verify your OPENAI_API_KEY environment variable is set correctly.
  • For CORS issues in browsers, configure FastAPI with fastapi.middleware.cors.CORSMiddleware.
  • If the model name is invalid, check for typos and update to the latest model names like gpt-4o.

Key Takeaways

  • Use FastAPI to build lightweight, async web servers for chatbot APIs.
  • Integrate OpenAI SDK with client.chat.completions.create() for chat completions.
  • Streaming responses enable real-time token delivery for better UX.
  • Always secure your API key via environment variables, never hardcode.
  • Keep model names updated to avoid breaking changes.
Verified 2026-04 · gpt-4o, gpt-4o-mini