How-to · Beginner · 3 min read

How to build a chatbot with FastAPI

Quick answer
Use FastAPI to create a web server that handles chat requests and integrates with the OpenAI SDK to generate responses from models like gpt-4o. Define an endpoint that accepts user messages, calls client.chat.completions.create(), and returns the AI's reply.

Prerequisites

  • Python 3.8+
  • OpenAI API key
  • pip install fastapi uvicorn "openai>=1.0"

Setup

Install required packages and set your OpenAI API key as an environment variable.

  • Install FastAPI, Uvicorn (ASGI server), and the OpenAI SDK. Quote the version specifier so the shell doesn't treat `>` as a redirect:

```bash
pip install fastapi uvicorn "openai>=1.0"
```
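
The code below reads the API key from the environment. A typical way to set it for the current shell session (the key value here is a placeholder, not a real key):

```shell
# Replace the placeholder with your actual key.
# This lasts only for the current shell session.
export OPENAI_API_KEY="your-api-key-here"
```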

Step by step

Create a FastAPI app with a POST endpoint /chat that accepts JSON messages, calls OpenAI's gpt-4o model, and returns the chatbot response.

```python
import os
from fastapi import FastAPI
from pydantic import BaseModel
from openai import OpenAI

app = FastAPI()
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

class ChatRequest(BaseModel):
    # OpenAI-style messages, e.g. [{"role": "user", "content": "Hi"}]
    messages: list

# A plain `def` endpoint: FastAPI runs it in a threadpool, so the
# blocking OpenAI SDK call doesn't stall the event loop.
@app.post("/chat")
def chat_endpoint(chat_request: ChatRequest):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=chat_request.messages
    )
    return {"reply": response.choices[0].message.content}

# To run: uvicorn filename:app --reload
```
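
With the server running (Uvicorn defaults to http://localhost:8000), the endpoint expects a JSON body containing an OpenAI-style messages list. A minimal sketch of building that payload for the /chat route defined above:

```python
import json

# OpenAI-style messages: a list of {"role", "content"} dicts.
# "system" sets the assistant's behavior; "user" carries the person's input.
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ]
}

# Serialize to the JSON body the endpoint expects.
body = json.dumps(payload)
```

The same payload can be sent from the command line, e.g. `curl -s -X POST http://localhost:8000/chat -H "Content-Type: application/json" -d '{"messages": [{"role": "user", "content": "Hello!"}]}'`.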

Common variations

You can use asynchronous calls, enable streaming responses for real-time tokens, or switch to cheaper models like gpt-4o-mini. For streaming from an async endpoint, use the SDK's AsyncOpenAI client, iterate over the response chunks, and yield each token as it arrives.

```python
from fastapi.responses import StreamingResponse
from openai import AsyncOpenAI

# An async client, so the stream can be consumed with `async for`
# without blocking the event loop.
async_client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

@app.post("/chat_stream")
async def chat_stream(chat_request: ChatRequest):
    stream = await async_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=chat_request.messages,
        stream=True
    )

    async def event_generator():
        async for chunk in stream:
            # Some chunks (e.g. the final one) carry no content.
            if chunk.choices and chunk.choices[0].delta.content:
                yield chunk.choices[0].delta.content

    return StreamingResponse(event_generator(), media_type="text/plain")
```

To watch tokens arrive incrementally, call this endpoint with `curl -N` (disables output buffering).

Troubleshooting

  • If you get authentication errors, verify your OPENAI_API_KEY environment variable is set correctly.
  • For CORS issues in browsers, configure FastAPI with fastapi.middleware.cors.CORSMiddleware.
  • If the model name is invalid, check for typos and update to the latest model names like gpt-4o.

Key Takeaways

  • Use FastAPI to build lightweight, async web servers for chatbot APIs.
  • Integrate OpenAI SDK with client.chat.completions.create() for chat completions.
  • Streaming responses enable real-time token delivery for better UX.
  • Always secure your API key via environment variables, never hardcode.
  • Keep model names updated to avoid breaking changes.
Verified 2026-04 · gpt-4o, gpt-4o-mini