High severity · Intermediate · Fix: 5-10 min

ContextLengthExceededException

aws_bedrock.exceptions.ContextLengthExceededException

What this error means
AWS Bedrock rejects requests when the input prompt plus conversation history exceeds the model's maximum context length.

Stack trace

traceback
aws_bedrock.exceptions.ContextLengthExceededException: The input context length exceeded the maximum allowed tokens for this model.
QUICK FIX
Truncate the input text and conversation history so that their combined token count fits within the model's maximum context length before sending the request.

Why it happens

AWS Bedrock models have a fixed maximum context length that limits the total tokens in the prompt and conversation history. When this limit is exceeded, the service throws a ContextLengthExceededException to prevent processing overly long inputs.

Detection

Count (or estimate) the tokens in each prompt and its conversation history before sending a request, and catch ContextLengthExceededException so that any request that still exceeds the limit is logged and handled.
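A pre-flight check along these lines could flag oversized requests before they are sent. This is a minimal sketch, not part of any SDK: `estimate_tokens` and `fits_context` are hypothetical helper names, and the 4-characters-per-token ratio is a rough heuristic for English text, not the model's real tokenizer.

```python
import logging

logger = logging.getLogger("context_guard")


def estimate_tokens(text: str) -> int:
    """Rough estimate (assumption): about 4 characters per token, rounded up."""
    return (len(text) + 3) // 4


def fits_context(prompt: str, history: list[str], max_context_tokens: int) -> bool:
    """Return True if the estimated total stays within the limit; log a warning otherwise."""
    total = estimate_tokens(prompt) + sum(estimate_tokens(turn) for turn in history)
    if total > max_context_tokens:
        logger.warning(
            "Estimated %d tokens exceeds the limit of %d", total, max_context_tokens
        )
        return False
    return True
```

Because the estimate is approximate, it is safer to pass a `max_context_tokens` value somewhat below the model's documented limit.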

Causes & fixes

1

Input prompt plus conversation history tokens exceed the model's max context length.

✓ Fix

Truncate or summarize conversation history and reduce prompt length to fit within the model's documented maximum context length.

2

Repeatedly appending full conversation history without pruning or summarization.

✓ Fix

Implement a sliding window or summarization strategy to keep context length under the limit.

3

Using a model with a smaller context length than expected for your use case.

✓ Fix

Switch to a Bedrock model variant that supports a larger maximum context length if available.
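The sliding-window strategy from fix 2 might look roughly like the sketch below. It keeps the newest turns that fit next to the current prompt and drops older ones. The function names are hypothetical, history is assumed to be a list of strings ordered oldest to newest, and the token count is a character-based estimate rather than the model's real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate (assumption): about 4 characters per token, rounded up."""
    return (len(text) + 3) // 4


def sliding_window(history: list[str], prompt: str, max_context_tokens: int) -> list[str]:
    """Keep the most recent turns that fit alongside the prompt."""
    budget = max_context_tokens - estimate_tokens(prompt)
    kept = []
    # Walk from newest to oldest and stop at the first turn that no longer fits.
    for turn in reversed(history):
        cost = estimate_tokens(turn)
        if cost > budget:
            break
        kept.append(turn)
        budget -= cost
    kept.reverse()  # restore oldest-to-newest order
    return kept
```

If the prompt alone exceeds the budget, the function returns an empty history, and the prompt itself still needs to be shortened.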

Code: broken vs fixed

Broken - triggers the error
python
from aws_bedrock import BedrockClient

client = BedrockClient()

response = client.invoke_model(
    modelId='my-bedrock-model',
    inputText='Very long prompt text that exceeds the model context length...',
    # This line triggers ContextLengthExceededException
)
print(response)
Fixed - works correctly
python
import os
from aws_bedrock import BedrockClient

client = BedrockClient(
    aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'],
    region_name=os.environ.get('AWS_REGION', 'us-east-1')
)

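# Truncate the input to fit the model's context length.
# Note: whitespace-separated words only approximate tokens, and tokenizers
# usually produce more tokens than words, so cap the word count below the limit.
max_tokens = 4096  # example value; use the documented limit for your model
max_words = int(max_tokens * 0.75)  # rough heuristic for English text
input_text = 'Very long prompt text that exceeds the model context length...'
words = input_text.split()
if len(words) > max_words:
    input_text = ' '.join(words[:max_words])  # simple truncation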

response = client.invoke_model(
    modelId='my-bedrock-model',
    inputText=input_text
)
print(response)  # Fixed: truncated input to avoid context length error
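The fix truncates the input before sending it so that the prompt fits within the model's maximum context length, which avoids the ContextLengthExceededException. Word count only approximates token count, so leave some headroom below the limit.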

Workaround

Catch ContextLengthExceededException and programmatically truncate or summarize the input prompt and conversation history before retrying the request.
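One way to sketch this catch-and-retry workaround is shown below. To keep the example runnable on its own, `ContextLengthExceededException` is defined locally as a stand-in for the SDK's exception, and `send` is a hypothetical callable that wraps the actual request. Both are assumptions to adapt to your client code.

```python
class ContextLengthExceededException(Exception):
    """Local stand-in (assumption) for the SDK's exception class."""


def invoke_with_retry(send, prompt: str, max_attempts: int = 4, shrink: float = 0.5) -> str:
    """Call send(prompt); on a context-length error, shorten the prompt and retry."""
    text = prompt
    for _ in range(max_attempts):
        try:
            return send(text)
        except ContextLengthExceededException:
            keep = int(len(text) * shrink)
            if keep == 0:
                break
            text = text[-keep:]  # keep the most recent part of the text
    raise ContextLengthExceededException("prompt still too long after truncation")
```

Keeping the tail of the text is a design choice that favors the most recent conversation turns. If the important instructions sit at the start of the prompt, truncate the middle or the history instead.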

Prevention

Implement token counting and context management strategies such as sliding windows or summarization to keep inputs within the model's maximum context length before sending requests.
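As an illustration of the summarization approach, the sketch below folds older turns into a single summary line once the history exceeds a token budget. The `summarize` argument is a hypothetical callable (for example, a separate model call) supplied by the caller, and the token count is again a character-based estimate, not the model's tokenizer.

```python
from typing import Callable


def estimate_tokens(text: str) -> int:
    """Rough estimate (assumption): about 4 characters per token, rounded up."""
    return (len(text) + 3) // 4


def compact_history(
    history: list[str],
    max_history_tokens: int,
    summarize: Callable[[str], str],
    keep_recent: int = 2,
) -> list[str]:
    """Replace older turns with one summary when the history exceeds the budget."""
    total = sum(estimate_tokens(turn) for turn in history)
    if total <= max_history_tokens or len(history) <= keep_recent:
        return list(history)
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    summary = summarize("\n".join(older))
    return [f"Summary of earlier conversation: {summary}"] + recent
```

The result is not guaranteed to fit the budget, since that depends on how short the summary is, so it is worth re-checking the token count before sending the request.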

Python 3.9+ · aws-bedrock-sdk >=1.0.0 · tested on 1.0.x
Verified 2026-04