How to serve Claude responses with FastAPI
Quick answer
Use the `anthropic` Python SDK to call Claude models and serve responses via FastAPI. Initialize the Anthropic client with your API key from `os.environ`, then create an endpoint that sends user messages to `client.messages.create` with the `claude-3-5-sonnet-20241022` model and returns the AI's reply.

Prerequisites

- Python 3.8+
- An Anthropic API key
- `pip install anthropic fastapi uvicorn`
Setup
Install the required packages and set your Anthropic API key as an environment variable.
- Install packages:

  ```shell
  pip install anthropic fastapi uvicorn
  ```

- Set your Anthropic API key as an environment variable in your shell:

  ```shell
  export ANTHROPIC_API_KEY='your_api_key_here'
  ```
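Since a missing key otherwise surfaces as a `KeyError` deep inside the app, it can help to check for it explicitly at startup. A minimal sketch; the helper name `require_api_key` is illustrative, not part of the SDK:

```python
import os


def require_api_key(name: str = "ANTHROPIC_API_KEY") -> str:
    # Fail fast with a clear message instead of a KeyError at request time.
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the app")
    return value
```

Calling `require_api_key()` once at module import time makes misconfiguration obvious the moment the server starts.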
Step by step
Create a FastAPI app that accepts user input, calls the Claude model, and returns the response.
```python
import os

from anthropic import Anthropic
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])


class UserMessage(BaseModel):
    message: str


@app.post("/chat")
async def chat_with_claude(user_message: UserMessage):
    try:
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1000,
            system="You are a helpful assistant.",
            messages=[{"role": "user", "content": user_message.message}],
        )
        return {"response": response.content[0].text}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
```
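Once the app is running, the `/chat` endpoint can be exercised with only the standard library. A sketch, assuming uvicorn's default host and port; `build_request` and `ask` are illustrative helpers, not part of any SDK:

```python
import json
from urllib.request import Request, urlopen

CHAT_URL = "http://127.0.0.1:8000/chat"  # uvicorn's default host and port


def build_request(message: str, url: str = CHAT_URL) -> Request:
    # Build the POST request the /chat endpoint expects:
    # a JSON body with a single "message" field.
    body = json.dumps({"message": message}).encode()
    return Request(url, data=body, headers={"Content-Type": "application/json"})


def ask(message: str) -> str:
    # Send the request and unpack the {"response": ...} JSON the endpoint returns.
    with urlopen(build_request(message)) as resp:
        return json.loads(resp.read())["response"]
```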
To run: `uvicorn filename:app --reload` (replace `filename` with your module name).

Common variations
You can adapt the example to use fully asynchronous calls, change the Claude model version, or stream responses.

- Use `claude-opus-4` or other Claude models by changing the `model` parameter.
- Implement streaming with FastAPI's `StreamingResponse` and the SDK's streaming support.
- Note that the endpoint above is declared `async` but calls the synchronous client, which blocks the event loop under load. For true async handling, use the SDK's `AsyncAnthropic` client and `await` its calls; alternatively, declare the endpoint with plain `def` (as in the example below) and FastAPI will run it in a thread pool.
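The streaming variation hinges on turning a sequence of text chunks into an HTTP stream. A minimal sketch of the formatting side using server-sent events, independent of the SDK; the chunk iterable stands in for whatever the SDK's streaming API yields:

```python
def to_sse(chunk: str) -> str:
    # Format one text chunk as a server-sent event.
    return f"data: {chunk}\n\n"


def sse_stream(chunks):
    # Wrap an iterable of text chunks (e.g. tokens from a streaming API call)
    # so the result can be passed to fastapi.responses.StreamingResponse
    # with media_type="text/event-stream".
    for chunk in chunks:
        yield to_sse(chunk)
```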
```python
import os

from anthropic import Anthropic
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])


class UserMessage(BaseModel):
    message: str


@app.post("/chat")
def chat_with_claude(user_message: UserMessage):
    response = client.messages.create(
        model="claude-opus-4",
        max_tokens=500,
        system="You are a helpful assistant.",
        messages=[{"role": "user", "content": user_message.message}],
    )
    return {"response": response.content[0].text}
```

Troubleshooting
- If you get authentication errors, verify that your `ANTHROPIC_API_KEY` environment variable is set correctly.
- For timeout or network errors, check your internet connection and API endpoint availability.
- If the response is empty or incomplete, try increasing `max_tokens` or check your API usage limits.
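Transient timeout and network errors often resolve on retry. A sketch of retrying with exponential backoff; `with_retries` is an illustrative helper (the `call` argument stands in for a zero-argument wrapper around `client.messages.create`), and note that the SDK also has its own built-in retry behavior:

```python
import time


def with_retries(call, attempts: int = 3, base_delay: float = 1.0):
    # Retry a callable on failure, doubling the delay between attempts.
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```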
Key Takeaways
- Use the official `anthropic` SDK with environment-based API keys for secure integration.
- Serve Claude responses in FastAPI by creating a POST endpoint that calls `client.messages.create`.
- Adjust model versions and parameters like `max_tokens` to fit your use case.
- Handle exceptions gracefully to avoid server crashes and provide meaningful error messages.
- Test locally with `uvicorn filename:app --reload` before deploying.