High severity · intermediate · Fix: 2-5 min

ContextWindowTooLongError

anthropic.errors.ContextWindowTooLongError

What this error means
Anthropic's API rejects a request when the prompt tokens plus the requested completion length (max_tokens) exceed the model's maximum context window size.

Stack trace

traceback
anthropic.errors.ContextWindowTooLongError: The prompt plus completion tokens exceed the model's maximum context window size of 9000 tokens.
QUICK FIX
Shorten your prompt or reduce max_tokens so total tokens stay within the model's context window limit.

Why it happens

Each Anthropic model has a fixed maximum context window, and its size depends on the model (the trace above reports a limit of 9000 tokens). The window is shared by input and output: when the tokens in the prompt plus the requested completion length (max_tokens) exceed it, the API rejects the request with this error instead of processing it.

Detection

Monitor token usage by counting tokens in your prompt and requested completion length before sending requests; log and alert when approaching the model's context window limit.
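
A minimal sketch of such a pre-flight check, using only the standard library. The names estimate_tokens and check_budget, the 80% warning threshold, and the 4-characters-per-token heuristic are all assumptions made for illustration; the heuristic is not a real tokenizer, so swap in an actual token count where one is available.

```python
import logging

logger = logging.getLogger("context_budget")


def estimate_tokens(text: str) -> int:
    """Rough estimate assuming about 4 characters per token (not a real tokenizer)."""
    return (len(text) + 3) // 4


def check_budget(prompt_tokens: int, max_tokens: int, context_window: int,
                 warn_ratio: float = 0.8) -> str:
    """Classify a planned request as 'ok', 'warn', or 'over' before sending it."""
    total = prompt_tokens + max_tokens
    if total > context_window:
        logger.error("request needs %d tokens, limit is %d", total, context_window)
        return "over"
    if total >= warn_ratio * context_window:
        logger.warning("request uses %d of %d tokens", total, context_window)
        return "warn"
    return "ok"
```

Calling check_budget(estimate_tokens(prompt), max_tokens, context_window) before each request gives a hook for logging and alerting; the context_window value has to come from the documentation for the model in use.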

Causes & fixes

1

Prompt text is too long and combined with requested completion exceeds the model's token limit.

✓ Fix

Reduce prompt length by summarizing or chunking input text, or decrease the max_tokens parameter for completion.

2

Requesting a completion length (max_tokens) that is too large given the prompt size.

✓ Fix

Lower the max_tokens parameter to ensure prompt tokens plus max_tokens fit within the model's context window.

3

Not accounting for token overhead from system messages or metadata in the prompt.

✓ Fix

Include all tokens from system, user, and assistant messages in token count calculations to stay within limits.
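
The three fixes above reduce to one calculation: count every part of the prompt, then cap max_tokens at whatever room is left. The sketch below is one way to write that down. The helper names, the count callable, and the per_message_overhead value are hypothetical; the real per-message overhead is not stated in this page, so treat it as a tunable safety margin.

```python
from typing import Callable, Dict, List


def total_prompt_tokens(system: str, messages: List[Dict[str, str]],
                        count: Callable[[str], int],
                        per_message_overhead: int = 4) -> int:
    """Sum tokens across the system prompt and every message, plus a margin per message."""
    total = count(system) if system else 0
    for message in messages:
        total += count(message["content"]) + per_message_overhead
    return total


def clamp_max_tokens(requested: int, prompt_tokens: int, context_window: int) -> int:
    """Return the largest max_tokens <= requested that still fits in the window."""
    available = context_window - prompt_tokens
    if available <= 0:
        # Lowering max_tokens cannot help here; the prompt itself must shrink.
        raise ValueError("prompt alone fills the context window")
    return min(requested, available)
```

For example, with 8500 prompt tokens and a 9000-token window, a requested max_tokens of 1000 is clamped to 500.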

Code: broken vs fixed

Broken - triggers the error
python
import os
from anthropic import Anthropic

client = Anthropic(api_key=os.environ['ANTHROPIC_API_KEY'])

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Very long prompt text exceeding context window..."}],
    max_tokens=1000  # prompt tokens + 1000 exceed the context window, so the call raises
)
print(response.content[0].text)  # never reached
Fixed - works correctly
python
import os
from anthropic import Anthropic

client = Anthropic(api_key=os.environ['ANTHROPIC_API_KEY'])

# Shortened prompt and reduced max_tokens to fit context window
short_prompt = "Summarized or chunked prompt text within token limits."
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": short_prompt}],
    max_tokens=500  # Reduced to avoid context window overflow
)
print(response.content[0].text)  # Fixed: no ContextWindowTooLongError
The fixed version shortens the prompt and lowers max_tokens so that the total token count fits within the model's context window, which prevents the error.

Workaround

Catch ContextWindowTooLongError, then programmatically truncate or chunk the prompt and retry the request with a smaller max_tokens.
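
One way to sketch this retry loop without tying it to a specific SDK version: the function below takes the request as a callable and the exception class as a parameter. call_with_shrinking, send, and too_long_error are hypothetical names, and halving the prompt and max_tokens on each failure is an illustrative policy, not a recommendation from the API. Pass in whichever exception class your installed SDK actually raises for this condition.

```python
from typing import Callable, Type


def call_with_shrinking(send: Callable[[str, int], str], prompt: str, max_tokens: int,
                        too_long_error: Type[Exception], attempts: int = 3,
                        min_max_tokens: int = 64) -> str:
    """Call send(prompt, max_tokens); on a too-long error, shrink both and retry."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return send(prompt, max_tokens)
        except too_long_error:
            if attempt == attempts - 1:
                raise  # out of retries: surface the original error
            # Keep the first half of the prompt and halve the completion budget.
            prompt = prompt[: max(1, len(prompt) // 2)]
            max_tokens = max(min_max_tokens, max_tokens // 2)
    raise RuntimeError("unreachable")
```

Truncating from the end keeps the opening instructions intact; if the important content sits at the end of the prompt, truncate from the front or summarize instead.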

Prevention

Implement token counting before API calls and design prompts to stay well below the model's context window limit, using chunking or summarization as needed.
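
As an illustration of the chunking approach, the sketch below greedily packs whitespace-separated words into chunks that each stay within a token budget. chunk_by_budget and the count callable are hypothetical; splitting on whitespace is a simplification, and a single word larger than the budget still becomes its own oversized chunk.

```python
from typing import Callable, List


def chunk_by_budget(text: str, budget: int, count: Callable[[str], int]) -> List[str]:
    """Split text into chunks whose token count (per count) is at most budget."""
    chunks: List[str] = []
    current: List[str] = []
    for word in text.split():
        candidate = " ".join(current + [word])
        if current and count(candidate) > budget:
            # Adding this word would overflow: close the chunk and start a new one.
            chunks.append(" ".join(current))
            current = [word]
        else:
            current.append(word)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk can then be sent as its own request, with the budget set to the context window minus max_tokens and a safety margin for the system prompt.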

Python 3.9+ · anthropic >=0.20.0 · tested on 0.20.x
Verified 2026-04 · claude-2, claude-3-5-sonnet-20241022