How to build an OpenAI-compatible API with FastAPI
Quick answer
Use FastAPI to build a REST API that mimics OpenAI's chat completions endpoint by forwarding requests through the OpenAI SDK client. Implement a POST endpoint at /v1/chat/completions that accepts model and messages, then return the response in OpenAI's standard format.

Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install fastapi uvicorn "openai>=1.0"
Setup
Install FastAPI for the web framework, uvicorn as the ASGI server, and the official openai SDK for API calls. Set your OpenAI API key as an environment variable.
pip install fastapi uvicorn "openai>=1.0"
# Set environment variable (Linux/macOS)
export OPENAI_API_KEY="your-api-key-here"
# On Windows PowerShell
$env:OPENAI_API_KEY = "your-api-key-here"

Step by step
Create a FastAPI app with a POST endpoint /v1/chat/completions that accepts JSON payload with model and messages. Use the OpenAI SDK client to call chat.completions.create and return the response in OpenAI's standard format.
import os
from typing import List

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from openai import OpenAI

app = FastAPI()
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

class Message(BaseModel):
    role: str
    content: str

class ChatCompletionRequest(BaseModel):
    model: str
    messages: List[Message]

@app.post("/v1/chat/completions")
async def chat_completions(request: ChatCompletionRequest):
    try:
        response = client.chat.completions.create(
            model=request.model,
            messages=[{"role": m.role, "content": m.content} for m in request.messages],
        )
        # The SDK response is a Pydantic model; dump it so FastAPI
        # returns OpenAI's standard JSON shape verbatim.
        return response.model_dump()
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
# To run:
# uvicorn filename:app --reload

Common variations
- Async client calls: The OpenAI Python SDK v1+ ships an AsyncOpenAI client; use it with await inside FastAPI's async endpoints so calls don't block the event loop.
- Streaming responses: Use server-sent events (SSE) or WebSockets to stream tokens; this requires custom handling beyond this basic example.
- Different models: Change the model parameter to any supported OpenAI model, such as gpt-4o or gpt-4o-mini.
Troubleshooting
- If you get a KeyError for OPENAI_API_KEY, ensure the environment variable is set and restart the server.
- For a 500 Internal Server Error, check your API key validity and network connectivity.
- Run uvicorn filename:app --reload to auto-reload on code changes and see detailed error logs.
Key Takeaways
- Use FastAPI to create a POST endpoint matching OpenAI's chat completions API signature.
- Leverage the official OpenAI Python SDK v1+ for backend calls with environment-based API keys.
- Customize model and messages parameters to support any OpenAI chat model.
- Handle errors gracefully with FastAPI's HTTPException for production readiness.
- Streaming and async calls (via AsyncOpenAI) require implementation beyond the basic sync example.