High severity · Intermediate · Fix: 2-5 min

ContextWindowExceededError

anthropic.errors.ContextWindowExceededError

What this error means
The API rejected the request because the input tokens plus the requested output tokens (max_tokens) exceed the model's maximum context window.

Stack trace

traceback
anthropic.errors.ContextWindowExceededError: Request tokens (input + output) exceed the model's maximum context window size of 9000 tokens.
QUICK FIX
Reduce prompt length or max_tokens so total tokens fit within the model's context window limit.

Why it happens

Each Anthropic Claude model has a fixed maximum context window, and its size depends on the model (the trace above reports 9000 tokens). When the prompt tokens plus the requested completion tokens (max_tokens) exceed this limit, the API rejects the request with this error instead of processing an oversized input.

Detection

Before sending each request, add the input prompt tokens to the requested output tokens (max_tokens) and compare the total against the model's context window. Log the token counts, and catch ContextWindowExceededError to identify the calls that still go over the limit.
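The pre-flight check described above can be sketched as follows. This is a minimal illustration, not part of the anthropic SDK: fits_context_window and the logger name are hypothetical, and the token counts are assumed to come from whatever tokenizer or counting method you already use. The 9000-token window is the figure from the trace above; substitute the limit of your model.

```python
import logging

logger = logging.getLogger("token_budget")  # hypothetical logger name


def fits_context_window(input_tokens: int, max_tokens: int, context_window: int) -> bool:
    """Return True when input plus requested output fits in the window."""
    total = input_tokens + max_tokens
    fits = total <= context_window
    # Log every check so over-limit calls show up in monitoring.
    logger.log(
        logging.INFO if fits else logging.WARNING,
        "input=%d max_tokens=%d total=%d window=%d fits=%s",
        input_tokens, max_tokens, total, context_window, fits,
    )
    return fits


# Window size taken from the trace above (9000 tokens).
print(fits_context_window(input_tokens=5200, max_tokens=5000, context_window=9000))  # False
print(fits_context_window(input_tokens=5200, max_tokens=1000, context_window=9000))  # True
```

Running the check before every request turns most over-limit calls into a local warning rather than a failed API round trip.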

Causes & fixes

1

The input prompt tokens plus the requested completion tokens (max_tokens) exceed the model's context window (9000 tokens in the trace above).

✓ Fix

Reduce the input prompt length or lower the max_tokens parameter to ensure total tokens stay within the model's context window.

2

Conversation history is appended on every turn without truncation, so the token count eventually grows past the limit.

✓ Fix

Implement conversation history truncation or summarization to keep prompt tokens under the context window limit.

3

Using a model variant with a smaller context window than expected (e.g., Claude 1 vs Claude 2).

✓ Fix

Verify the model variant supports the desired context window size and switch to a larger context window model if needed.
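The history truncation suggested for cause 2 could look roughly like this. It is a sketch under stated assumptions: truncate_history and rough_token_count are hypothetical helpers, and the 4-characters-per-token ratio is a crude estimate, not a real tokenizer. Pass your own count_tokens function if you have an accurate one.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def rough_token_count(text: str) -> int:
    """Crude estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def truncate_history(
    messages: List[Message],
    token_budget: int,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> List[Message]:
    """Keep the most recent messages whose combined estimate fits token_budget."""
    kept: List[Message] = []
    used = 0
    # Walk from newest to oldest so the most recent turns survive.
    for message in reversed(messages):
        cost = count_tokens(message["content"])
        if used + cost > token_budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    # Drop leading assistant turns so the history starts with a user message.
    while kept and kept[0]["role"] != "user":
        kept.pop(0)
    return kept


history = [
    {"role": "user", "content": "a" * 400},       # about 100 tokens
    {"role": "assistant", "content": "b" * 400},  # about 100 tokens
    {"role": "user", "content": "c" * 400},       # about 100 tokens
]
print(len(truncate_history(history, token_budget=250)))  # 1
```

Dropping whole messages from the oldest end is the simplest policy. Summarizing the dropped turns into one short message is an alternative when earlier context still matters.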

Code: broken vs fixed

Broken - triggers the error
python
import os
from anthropic import Anthropic

client = Anthropic(api_key=os.environ['ANTHROPIC_API_KEY'])

response = client.messages.create(
    system="",
    model="claude-2",
    messages=[{"role": "user", "content": "lorem ipsum " * 100_000}],  # Prompt far larger than the context window
    max_tokens=5000  # Combined with the prompt tokens, this exceeds the limit
)  # The request is rejected with ContextWindowExceededError
Fixed - works correctly
python
import os
from anthropic import Anthropic

client = Anthropic(api_key=os.environ['ANTHROPIC_API_KEY'])

# Reduced prompt length and max_tokens to fit context window
response = client.messages.create(
    system="",
    model="claude-2",
    messages=[{"role": "user", "content": "A shorter prompt that fits within the context window"}],
    max_tokens=1000  # Reduced to avoid exceeding token limit
)
print(response.content[0].text)  # Text of the first content block
The fixed version shortens the prompt and lowers max_tokens so that the total stays within the model's context window, which prevents the error.

Workaround

Catch ContextWindowExceededError and programmatically truncate or summarize the prompt before retrying the request.
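A catch-and-retry loop along these lines is one way to apply the workaround. This is a hedged sketch: call_with_truncation_retry, fake_send, and FakeOverflow are hypothetical names used only for illustration, and the exception class is passed in as a parameter because the exact class to catch depends on your client library and version. Truncation here keeps the tail of the prompt; summarization would slot into the same place.

```python
from typing import Callable, Type


def call_with_truncation_retry(
    send: Callable[[str], str],
    prompt: str,
    overflow_error: Type[Exception],
    max_attempts: int = 3,
    keep_ratio: float = 0.5,
) -> str:
    """Call send(prompt); on overflow_error, keep the tail of the prompt and retry."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return send(prompt)
        except overflow_error:
            if attempt == max_attempts - 1:
                raise  # Give up after the last attempt.
            # Keep only the most recent part of the prompt.
            keep = max(1, int(len(prompt) * keep_ratio))
            prompt = prompt[-keep:]
    raise RuntimeError("unreachable")


class FakeOverflow(Exception):
    """Stand-in for the over-limit exception your client raises."""


def fake_send(prompt: str) -> str:
    # Simulated API: rejects prompts longer than 100 characters.
    if len(prompt) > 100:
        raise FakeOverflow()
    return f"ok:{len(prompt)}"


print(call_with_truncation_retry(fake_send, "x" * 300, FakeOverflow))  # ok:75
```

In real use, send would wrap the API call and overflow_error would be the exception type you observe for over-limit requests. Capping the number of attempts keeps a prompt that can never fit from looping forever.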

Prevention

Track token usage client-side by counting prompt tokens before each request, and truncate or summarize the prompt whenever the count plus max_tokens would exceed the model's context window. Leave a safety margin, because client-side counts may differ from the count the API computes.
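As one possible prevention step, max_tokens can be capped to whatever room the prompt leaves. The sketch below is illustrative only: estimate_tokens and clamp_max_tokens are hypothetical helpers, the 4-characters-per-token ratio and the 256-token safety margin are assumptions, and the 9000-token window is the figure from the trace above.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough estimate; the characters-per-token ratio is an assumption."""
    return int(len(text) / chars_per_token) + 1


def clamp_max_tokens(
    prompt: str,
    requested_max_tokens: int,
    context_window: int,
    safety_margin: int = 256,
) -> int:
    """Return the largest max_tokens that still fits alongside the prompt."""
    available = context_window - estimate_tokens(prompt) - safety_margin
    if available < 1:
        # The prompt alone (plus margin) fills the window; shorten it first.
        raise ValueError("prompt leaves no room for a completion")
    return min(requested_max_tokens, available)


prompt = "x" * 20_000  # estimated at 5001 tokens
print(clamp_max_tokens(prompt, requested_max_tokens=5000, context_window=9000))  # 3743
print(clamp_max_tokens(prompt, requested_max_tokens=1000, context_window=9000))  # 1000
```

The safety margin absorbs the gap between a character-based estimate and the API's own token count. A real tokenizer would allow a smaller margin.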

Python 3.9+ · anthropic >=0.20.0 · tested on 0.20.x
Verified 2026-04 · claude-2, claude-1
