How to intermediate · 3 min read

How to stream Claude responses to a web app

Quick answer
Use the anthropic Python SDK's client.messages.create method with stream=True to receive streamed events from claude-3-5-sonnet-20241022. Iterate over the events, read the text from each content_block_delta event, and forward it to your web app UI as it arrives.

PREREQUISITES

  • Python 3.8+
  • Anthropic API key
  • pip install "anthropic>=0.20" (quote the requirement so the shell does not treat > as a redirect)

Setup

Install the official anthropic Python SDK and set your API key as an environment variable.

  • Run pip install anthropic
  • Run export ANTHROPIC_API_KEY='your_api_key' on macOS/Linux; on Windows, use set ANTHROPIC_API_KEY=your_api_key in Command Prompt or $env:ANTHROPIC_API_KEY = "your_api_key" in PowerShell
bash
pip install anthropic
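
A missing key otherwise surfaces only when the first request fails, so it can help to check for it at startup. The sketch below is one way to do that; require_api_key is a hypothetical helper name, not part of the anthropic SDK.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from env, or raise a clear error if it is missing.

    Hypothetical helper for illustration; not part of the anthropic SDK.
    """
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it before starting the app."
        )
    return key
```

Calling require_api_key() once when the app starts, and passing the result to the client constructor, turns a confusing mid-request failure into an immediate, readable one.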

Step by step

This example streams a Claude response and prints each piece of text to the console as it arrives. You can adapt the same loop to push each piece to a web client via WebSocket or Server-Sent Events.

python
import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Tell me a joke."}],
    max_tokens=100,
    stream=True,
)

print("Streaming response:")
for event in response:
    # With stream=True the SDK yields typed events, not dicts.
    # Only content_block_delta events with a text_delta carry new output text.
    if event.type == "content_block_delta" and event.delta.type == "text_delta":
        print(event.delta.text, end="", flush=True)
print()
output
Streaming response:
Why did the scarecrow win an award? Because he was outstanding in his field!
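
A Messages API stream yields typed events (message_start, content_block_delta, message_stop, and others) rather than bare strings, so code that forwards output to a browser usually needs to filter for the text-bearing events first. The sketch below isolates that filtering in a generator. iter_text_deltas is a hypothetical helper name, and the SimpleNamespace objects are stand-ins for SDK events so the example runs without an API call.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def iter_text_deltas(events: Iterable) -> Iterator[str]:
    """Yield only the text carried by content_block_delta / text_delta events."""
    for event in events:
        if getattr(event, "type", None) != "content_block_delta":
            continue  # message_start, ping, message_stop, etc. carry no text
        delta = getattr(event, "delta", None)
        if getattr(delta, "type", None) == "text_delta":
            yield delta.text


# Stand-in events (assumption: shaped like the SDK's event objects)
fake_events = [
    SimpleNamespace(type="message_start"),
    SimpleNamespace(
        type="content_block_delta",
        delta=SimpleNamespace(type="text_delta", text="Hello"),
    ),
    SimpleNamespace(
        type="content_block_delta",
        delta=SimpleNamespace(type="text_delta", text=", world"),
    ),
    SimpleNamespace(type="message_stop"),
]

print("".join(iter_text_deltas(fake_events)))  # prints "Hello, world"
```

In real use you would pass the object returned by client.messages.create(..., stream=True) in place of fake_events.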

Common variations

You can integrate streaming with async frameworks like FastAPI or Starlette to push tokens to the frontend in real time. Also, you can change the model to other Claude versions or adjust max_tokens and temperature for different outputs.

python
import os
import anthropic
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()
# AsyncAnthropic keeps the event loop free while waiting on the API;
# the synchronous client would block other requests during streaming.
client = anthropic.AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

async def stream_claude():
    response = await client.messages.create(
        model="claude-3-5-sonnet-20241022",
        messages=[{"role": "user", "content": "What's the weather like today?"}],
        max_tokens=50,
        stream=True,
    )
    async for event in response:
        # Forward only text deltas; other event types carry metadata
        if event.type == "content_block_delta" and event.delta.type == "text_delta":
            yield event.delta.text

@app.get("/stream")
async def stream():
    return StreamingResponse(stream_claude(), media_type="text/plain")
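
If the frontend reads the stream with the browser's EventSource API, each piece of text has to be framed as a Server-Sent Events message rather than sent as raw text. The sketch below shows one way to frame a chunk. sse_frame is a hypothetical helper name, and it assumes the endpoint responds with media_type="text/event-stream".

```python
def sse_frame(chunk: str) -> str:
    """Format one text chunk as a Server-Sent Events message.

    Each line of the payload gets its own "data:" field, and a blank line
    terminates the message, so chunks containing newlines survive intact.
    """
    lines = chunk.split("\n")
    return "".join(f"data: {line}\n" for line in lines) + "\n"


print(repr(sse_frame("Hello")))  # 'data: Hello\n\n'
print(repr(sse_frame("a\nb")))   # 'data: a\ndata: b\n\n'
```

Inside a streaming generator you would yield sse_frame(text) for each text delta instead of the bare text. The client then receives one message event per chunk.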

Troubleshooting

  • If streaming does not start, verify your API key is set correctly in ANTHROPIC_API_KEY.
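  • If the response is empty, check that you are reading text from content_block_delta events (event.delta.text), that the messages list is well formed, and that the model name is valid.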
  • For network errors, ensure your environment allows outbound HTTPS requests to Anthropic's API.
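
For transient network errors, one common approach is to retry the call with exponential backoff. The sketch below is generic: with_retries is a hypothetical helper, and the built-in ConnectionError is used as the default only so the example runs without the SDK. With the anthropic SDK you would likely pass anthropic.APIConnectionError in retry_on, and note that the client also accepts a max_retries option of its own.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each time: base_delay, 2 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

This wraps only the call that opens the stream. A failure partway through a stream would need separate handling, since text already sent to the browser cannot be retried transparently.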

Key Takeaways

  • Use stream=True in client.messages.create to enable streaming with Anthropic's Python SDK.
  • Process streamed chunks incrementally to update your web app UI in real time.
  • Integrate streaming with async web frameworks like FastAPI for live frontend updates.
  • Always set your API key securely via environment variables.
  • Verify model names and prompt formatting to avoid empty or failed responses.
Verified 2026-04 · claude-3-5-sonnet-20241022