# How to add a structured output endpoint to a FastAPI LLM app
## Quick answer
Use FastAPI to define an endpoint that calls an LLM via the `openai` SDK, then parse and return the model's response as structured JSON. Leverage `response_format` or prompt engineering to ensure the LLM outputs JSON, and use Pydantic models for response validation.

## Prerequisites

- Python 3.8+
- OpenAI API key (free tier works)
- `pip install fastapi uvicorn openai pydantic`
## Setup
Install required packages and set your OpenAI API key as an environment variable.
- Install FastAPI, Uvicorn, the OpenAI SDK, and Pydantic:

```shell
pip install fastapi uvicorn openai pydantic
```

## Step by step
Create a FastAPI app with a POST endpoint that sends a prompt to the OpenAI `gpt-4o-mini` model, requesting JSON output. Use a Pydantic model to validate and return structured data.
```python
import json
import os

from fastapi import FastAPI
from pydantic import BaseModel
from openai import OpenAI

app = FastAPI()
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])


class StructuredOutput(BaseModel):
    name: str
    age: int
    email: str


@app.post("/structured-output", response_model=StructuredOutput)
async def structured_output(prompt: str):
    # Prompt engineering to get JSON output
    system_prompt = (
        "You are a helpful assistant that outputs JSON "
        "with fields: name, age, email."
    )
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ]
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        response_format={"type": "json_object"},  # JSON mode: guarantees valid JSON
        temperature=0,
    )
    content = response.choices[0].message.content
    # Parse JSON from the LLM output
    try:
        data = json.loads(content)
    except json.JSONDecodeError:
        # Fallback: return an empty-but-valid payload
        return StructuredOutput(name="", age=0, email="")
    return StructuredOutput(**data)
```

## Common variations
- Use async calls with `await` (e.g. the `AsyncOpenAI` client) if your SDK supports it.
- Switch to other models like `claude-3-5-haiku-20241022` by changing the client and prompt accordingly.
- Implement streaming responses for real-time output.
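Another variation: rather than hard-coding the field list in the system prompt, it can be derived from the Pydantic model's own JSON schema so the prompt and the validation model never drift apart. A minimal sketch (the `build_system_prompt` helper is hypothetical, not part of the code above; assumes Pydantic v2):

```python
from typing import Type

from pydantic import BaseModel


class StructuredOutput(BaseModel):
    name: str
    age: int
    email: str


def build_system_prompt(model: Type[BaseModel]) -> str:
    # Pull the field names straight from the model's JSON schema
    fields = ", ".join(model.model_json_schema()["properties"])
    return (
        "You are a helpful assistant that outputs JSON "
        f"with fields: {fields}."
    )
```

Adding a field to `StructuredOutput` then automatically updates the instruction sent to the model.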
```python
import json
import os

from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])


@app.post("/structured-output-anthropic", response_model=StructuredOutput)
async def structured_output_anthropic(prompt: str):
    system = (
        "You are a helpful assistant that outputs JSON "
        "with fields: name, age, email."
    )
    message = client.messages.create(
        model="claude-3-5-haiku-20241022",
        max_tokens=512,
        system=system,
        messages=[{"role": "user", "content": prompt}],
    )
    # message.content is a list of content blocks; read the first text block
    try:
        data = json.loads(message.content[0].text)
    except json.JSONDecodeError:
        return StructuredOutput(name="", age=0, email="")
    return StructuredOutput(**data)
```

## Troubleshooting
- If JSON parsing fails, verify the prompt instructs the model to output strict JSON.
- Use `temperature=0` to reduce randomness and improve structured output consistency.
- Check that your `OPENAI_API_KEY` environment variable is set correctly.
- For deployment, ensure `uvicorn` runs your FastAPI app properly:

```shell
uvicorn main:app --reload
```

Output:

```
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
```
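A common cause of JSON parsing failures is the model wrapping its answer in Markdown code fences. A small pre-parse helper (hypothetical, not part of the endpoints above) can strip them before calling `json.loads`:

```python
import json


def strip_code_fences(text: str) -> str:
    """Remove a leading ```json (or plain ```) fence and a trailing ``` fence."""
    text = text.strip()
    if text.startswith("```"):
        # Drop the first line (the opening fence, possibly "```json")
        text = text.split("\n", 1)[1] if "\n" in text else ""
    if text.endswith("```"):
        text = text[:-3]
    return text.strip()


raw = '```json\n{"name": "Ada", "age": 36, "email": "ada@example.com"}\n```'
data = json.loads(strip_code_fences(raw))
```

Plain, unfenced JSON passes through the helper unchanged, so it is safe to apply unconditionally before parsing.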
## Key Takeaways
- Use prompt engineering to get LLMs to output strict JSON for structured endpoints.
- Validate and parse LLM JSON output with Pydantic models in FastAPI.
- Set temperature to 0 for deterministic structured responses.
- Use environment variables for API keys to keep credentials secure.
- Test endpoints locally with Uvicorn before deployment.
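The validation takeaway can be demonstrated standalone: Pydantic accepts well-formed model output and raises `ValidationError` when fields are missing, which is what protects the endpoint's response contract. A minimal sketch using the same `StructuredOutput` model (assumes Pydantic v2):

```python
from pydantic import BaseModel, ValidationError


class StructuredOutput(BaseModel):
    name: str
    age: int
    email: str


# Well-formed LLM output validates cleanly
parsed = StructuredOutput.model_validate_json(
    '{"name": "Ada", "age": 36, "email": "ada@example.com"}'
)

# Output with missing fields is rejected rather than silently passed through
try:
    StructuredOutput.model_validate_json('{"name": "Ada"}')
    missing = []
except ValidationError as exc:
    missing = [err["loc"][0] for err in exc.errors()]
```

Inspecting `exc.errors()` this way tells you exactly which fields the model omitted, which is useful for logging or for re-prompting.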