ContextLengthExceededException
aws_bedrock.exceptions.ContextLengthExceededException
Stack trace
aws_bedrock.exceptions.ContextLengthExceededException: The input context length exceeded the maximum allowed tokens for this model.
Why it happens
Each Amazon Bedrock model has a fixed maximum context length that caps the total number of tokens in the prompt and conversation history. When a request exceeds that limit, it fails with a ContextLengthExceededException rather than being processed.
Detection
Count the tokens in each prompt before sending a request, and catch ContextLengthExceededException so that any request that still exceeds the limit is logged and handled.
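A pre-flight check along these lines can flag oversized requests before they are sent. This is a minimal sketch, not a Bedrock API: the helper names `estimate_tokens` and `fits_context` are hypothetical, and the four-characters-per-token ratio is an assumed heuristic that varies by model and tokenizer.

```python
import logging

logger = logging.getLogger("context_length")

# Assumption: roughly 4 characters per token. Real tokenizers differ by model.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for text (a heuristic, not a tokenizer)."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def fits_context(prompt: str, history: list[str], max_context_tokens: int,
                 reserved_for_output: int = 0) -> bool:
    """Check whether prompt plus history stays within the token budget."""
    total = estimate_tokens(prompt) + sum(estimate_tokens(m) for m in history)
    budget = max_context_tokens - reserved_for_output
    if total > budget:
        # Log the numbers so oversized requests can be diagnosed later.
        logger.warning("Estimated %d tokens exceeds budget of %d", total, budget)
        return False
    return True
```

Because the estimate is approximate, it is safer to leave some headroom below the documented limit than to rely on the count being exact.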
Causes & fixes
Cause: Input prompt plus conversation history tokens exceed the model's max context length.
Fix: Truncate or summarize conversation history and reduce prompt length to fit within the model's documented maximum context length.
Cause: Repeatedly appending full conversation history without pruning or summarization.
Fix: Implement a sliding window or summarization strategy to keep context length under the limit.
Cause: Using a model with a smaller context length than expected for your use case.
Fix: Switch to a Bedrock model variant that supports a larger maximum context length if available.
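The sliding-window fix mentioned above could look roughly like the following sketch. The function names and the characters-per-token estimate are assumptions made for illustration; a real implementation would use the target model's own token counts.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate, assuming about 4 characters per token."""
    return (len(text) + 3) // 4


def sliding_window(history: list[str], max_history_tokens: int) -> list[str]:
    """Keep the most recent messages whose combined estimate fits the budget."""
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest and stop at the first message that does not fit.
    for message in reversed(history):
        cost = estimate_tokens(message)
        if used + cost > max_history_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Stopping at the first message that does not fit keeps the retained history contiguous, so the model never sees a conversation with a gap in the middle.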
Code: broken vs fixed
# Broken: sends the full prompt without checking its length
from aws_bedrock import BedrockClient

client = BedrockClient()
response = client.invoke_model(
    modelId='my-bedrock-model',
    inputText='Very long prompt text that exceeds the model context length...',
    # This line triggers ContextLengthExceededException
)
print(response)

# Fixed: truncates the input before sending
import os
from aws_bedrock import BedrockClient

client = BedrockClient(
    aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'],
    region_name=os.environ.get('AWS_REGION', 'us-east-1'),
)

# Truncate or summarize input to fit model context length
max_tokens = 4096  # example max context length for the model
input_text = 'Very long prompt text that exceeds the model context length...'
words = input_text.split()
# Word count is only a rough proxy for token count; a tokenizer is more precise
if len(words) > max_tokens:
    input_text = ' '.join(words[:max_tokens])  # simple truncation

response = client.invoke_model(
    modelId='my-bedrock-model',
    inputText=input_text,
)
print(response)  # Fixed: truncated input to avoid context length error

Workaround
Catch ContextLengthExceededException and programmatically truncate or summarize the input prompt and conversation history before retrying the request.
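One way to structure that catch-and-retry loop is sketched below. The exception class here is a local stand-in defined only so the example runs on its own, and `send` is a hypothetical callable that wraps whatever client call the application makes; neither is taken from a real SDK.

```python
from typing import Callable


class ContextLengthExceededException(Exception):
    """Local stand-in for the exception class named on this page."""


def invoke_with_truncation(send: Callable[[str], str], prompt: str,
                           max_attempts: int = 3,
                           keep_ratio: float = 0.5) -> str:
    """Call send(prompt); on a context length error, shorten and retry."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return send(prompt)
        except ContextLengthExceededException:
            if attempt == max_attempts:
                raise  # out of retries, surface the original error
            # Keep the tail of the prompt, assuming recent text matters most.
            keep = max(1, int(len(prompt) * keep_ratio))
            prompt = prompt[-keep:]
    raise AssertionError("unreachable")
```

Truncating by characters is crude. Replacing the truncation step with a summarization call is a natural variation when losing the start of the prompt is unacceptable.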
Prevention
Implement token counting and context management strategies such as sliding windows or summarization to keep inputs within the model's maximum context length before sending requests.
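The summarization strategy can be sketched as folding older messages into a single summary once the history grows past a budget. Everything named here is hypothetical: `compact_history`, the `summarize` callable (which in practice might be another model call), and the rough token estimate are illustrative assumptions.

```python
from typing import Callable


def estimate_tokens(text: str) -> int:
    """Rough estimate, assuming about 4 characters per token."""
    return (len(text) + 3) // 4


def compact_history(history: list[str], max_history_tokens: int,
                    summarize: Callable[[list[str]], str],
                    keep_recent: int = 2) -> list[str]:
    """Replace older messages with one summary when history exceeds the budget."""
    total = sum(estimate_tokens(m) for m in history)
    if total <= max_history_tokens or len(history) <= keep_recent:
        return list(history)  # nothing to compact
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    # The summary stands in for all older messages; recent ones stay verbatim.
    return [summarize(older)] + recent
```

This sketch does not verify that the compacted history fits the budget, so a caller would still want to re-check the token count afterward and fall back to truncation if the summary itself is too long.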