How-to · Beginner · 3 min read

How to serve Claude responses with FastAPI

Quick answer
Use the anthropic Python SDK to call Claude models and serve responses via FastAPI. Initialize the Anthropic client with your API key from os.environ, then create an endpoint that sends user messages to client.messages.create with the claude-3-5-sonnet-20241022 model and returns the AI's reply.

Prerequisites

  • Python 3.8+
  • Anthropic API key
  • pip install anthropic fastapi uvicorn

Setup

Install the required packages and set your Anthropic API key as an environment variable in your shell.

bash
pip install anthropic fastapi uvicorn
export ANTHROPIC_API_KEY='your_api_key_here'
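It can also help to fail fast at startup when the key is missing, rather than hitting an authentication error on the first request. A minimal sketch (the helper name require_api_key is just illustrative; ANTHROPIC_API_KEY is the variable the SDK reads by default):

```python
import os

def require_api_key(var: str = "ANTHROPIC_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; export it before starting the server")
    return key
```

Calling require_api_key() before constructing the client turns a confusing 401 later into an immediate, readable error.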

Step by step

Create a FastAPI app that accepts user input, calls the Claude model, and returns the response.

python
import os
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from anthropic import AsyncAnthropic

app = FastAPI()
# AsyncAnthropic keeps the async endpoint from blocking the event loop
client = AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

class UserMessage(BaseModel):
    message: str

@app.post("/chat")
async def chat_with_claude(user_message: UserMessage):
    try:
        response = await client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1000,
            system="You are a helpful assistant.",
            messages=[{"role": "user", "content": user_message.message}]
        )
        return {"response": response.content[0].text}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

# To run: uvicorn filename:app --reload
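The endpoint above returns response.content[0].text, but Claude responses are a list of content blocks, so if you later enable tool use or other block types, a small helper that concatenates only the text blocks is more robust. A sketch (extract_text is a hypothetical helper; plain dicts stand in for the SDK's block objects here):

```python
def extract_text(content_blocks) -> str:
    """Join the text of all text-type blocks in a Claude response."""
    parts = []
    for block in content_blocks:
        # SDK objects expose .type/.text attributes; dicts use keys
        if isinstance(block, dict):
            btype, btext = block.get("type"), block.get("text", "")
        else:
            btype, btext = getattr(block, "type", None), getattr(block, "text", "")
        if btype == "text":
            parts.append(btext)
    return "".join(parts)
```

With this in place, the endpoint can return {"response": extract_text(response.content)} and silently skip non-text blocks instead of raising on them.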

Common variations

You can swap in a different Claude model, stream tokens as they arrive, or call the API synchronously.

  • Change the model parameter to another Claude model, such as claude-opus-4-20250514.
  • Stream responses by combining FastAPI's StreamingResponse with the SDK's client.messages.stream helper.
  • Use synchronous calls by defining the endpoint with plain def; FastAPI runs sync endpoints in a thread pool, so the blocking client is safe there.
python
import os
from fastapi import FastAPI
from pydantic import BaseModel
from anthropic import Anthropic

app = FastAPI()
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

class UserMessage(BaseModel):
    message: str

@app.post("/chat")
def chat_with_claude(user_message: UserMessage):
    response = client.messages.create(
        model="claude-opus-4-20250514",
        max_tokens=500,
        system="You are a helpful assistant.",
        messages=[{"role": "user", "content": user_message.message}]
    )
    return {"response": response.content[0].text}

Troubleshooting

  • If you get authentication errors, verify your ANTHROPIC_API_KEY environment variable is set correctly.
  • For timeout or network errors, check your internet connection and API endpoint availability.
  • If the response is empty or incomplete, try increasing max_tokens or check for API usage limits.

Key takeaways

  • Use the official anthropic SDK with environment-based API keys for secure integration.
  • Serve Claude responses in FastAPI by creating a POST endpoint that calls client.messages.create.
  • Adjust model versions and parameters like max_tokens to fit your use case.
  • Handle exceptions gracefully to avoid server crashes and provide meaningful error messages.
  • Test locally with uvicorn filename:app --reload before deploying.
Verified 2026-04 · claude-3-5-sonnet-20241022, claude-opus-4