High severity · Intermediate · Fix: 2-5 min

ContextLengthExceededError

openai.error.ContextLengthExceededError

What this error means

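The system prompt plus the conversation messages exceed the model's maximum context window, so the API rejects the request before generating any output. With the openai Python SDK (>=1.0.0), this condition is raised as openai.BadRequestError (HTTP 400) carrying the error code context_length_exceeded.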

Stack trace

traceback

openai.error.ContextLengthExceededError: The combined length of the messages exceeds the model's maximum context length. Please reduce the prompt size or use a model with a larger context window.

QUICK FIX

Reduce the system prompt length or truncate conversation history to fit within the model's max token limit.

Why it happens

Each OpenAI model has a fixed context window, measured in tokens, that must hold the system prompt, every message in the conversation, and the tokens reserved for the completion. When the system prompt and history together leave no room, the total token count exceeds this limit and the API rejects the request with a context-length error.

Detection

Count tokens before sending a request: encode the prompts with tiktoken or a similar tokenizer, and assert that the total does not exceed the model's maximum context length.
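
A minimal sketch of such a pre-flight check follows. The limit of 128,000 tokens, the reply headroom, the per-message overhead, and the o200k_base encoding are assumptions chosen for illustration, and the helper names (make_token_counter, estimate_tokens, fits_context) are hypothetical. If tiktoken is not available, the sketch falls back to a rough estimate of four characters per token, so treat the result as an approximation rather than an exact count.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

# Assumed values for illustration; check your model's documentation.
MAX_CONTEXT_TOKENS = 128_000
RESERVED_FOR_REPLY = 1_000   # hypothetical headroom for the completion
PER_MESSAGE_OVERHEAD = 4     # rough allowance for role and formatting tokens


def make_token_counter() -> Callable[[str], int]:
    """Return a tiktoken-based counter, or a rough estimate if tiktoken is unusable."""
    try:
        import tiktoken  # optional third-party dependency

        # Assumed encoding for gpt-4o-family models.
        encoding = tiktoken.get_encoding("o200k_base")
        return lambda text: len(encoding.encode(text, disallowed_special=()))
    except Exception:
        # Fallback assumption: about 4 characters per token for English text.
        return lambda text: (len(text) + 3) // 4


count_tokens = make_token_counter()


def estimate_tokens(messages: List[Message]) -> int:
    """Estimate the prompt size of a chat request in tokens."""
    return sum(count_tokens(m["content"]) + PER_MESSAGE_OVERHEAD for m in messages)


def fits_context(
    messages: List[Message],
    max_context: int = MAX_CONTEXT_TOKENS,
    reserved: int = RESERVED_FOR_REPLY,
) -> bool:
    """True if the estimated prompt plus the reply headroom fits the window."""
    return estimate_tokens(messages) + reserved <= max_context
```

Calling fits_context(messages) right before client.chat.completions.create lets the application shorten the prompt locally instead of waiting for the API to reject it.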

Causes & fixes

System prompt text is excessively long, consuming most of the model's context window.

✓ Fix

Shorten or simplify the system prompt to reduce token usage, focusing on essential instructions only.

Accumulated conversation history plus system prompt exceeds the model's token limit.

✓ Fix

Implement conversation history truncation or summarization to keep total tokens within limits.

Using a model with a small maximum context window for a large prompt.

✓ Fix

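Switch to a model with a larger context window, such as gpt-4o or gemini-2.5-pro. Compare the documented limits first, because the switch only helps if the new model's window is actually larger than the current one. Note that gemini-2.5-pro is a Google model, so using it means changing the client or endpoint, not only the model name.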
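
The history-truncation fix above can be sketched as follows. This is one possible approach, not a prescribed one: it keeps every system message, then keeps the most recent messages that fit the remaining budget. The names truncate_history and rough_token_count are hypothetical, and the default counter is a rough four-characters-per-token estimate that can be swapped for a tiktoken-based one.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def rough_token_count(text: str) -> int:
    """Rough estimate (assumption): about 4 characters per token."""
    return (len(text) + 3) // 4


def truncate_history(
    messages: List[Message],
    max_tokens: int,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> List[Message]:
    """Keep all system messages plus the newest messages that fit max_tokens."""
    system = [m for m in messages if m["role"] == "system"]
    history = [m for m in messages if m["role"] != "system"]

    # Budget left for history once the system messages are accounted for.
    budget = max_tokens - sum(count_tokens(m["content"]) for m in system)

    kept: List[Message] = []
    for message in reversed(history):  # walk from newest to oldest
        cost = count_tokens(message["content"])
        if cost > budget:
            break  # this message and everything older is dropped
        kept.append(message)
        budget -= cost

    kept.reverse()  # restore chronological order
    return system + kept
```

If the system messages alone exceed max_tokens, the function returns only the system messages, which signals that the system prompt itself has to be shortened.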

Code: broken vs fixed

Broken - triggers the error

python

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ['OPENAI_API_KEY'])

system_prompt = """Very long system prompt text that exceeds the model's context window..."""
user_message = "Hello, how are you?"

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message}
    ]
)  # Rejected here: openai>=1.0.0 raises openai.BadRequestError (code "context_length_exceeded")
print(response.choices[0].message.content)

Fixed - works correctly

python

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ['OPENAI_API_KEY'])

# Shortened system prompt to fit context window
system_prompt = "Please answer concisely and clearly."
user_message = "Hello, how are you?"

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message}
    ]
)  # Fixed: system prompt shortened to avoid overflow
print(response.choices[0].message.content)

The fix shortens the system prompt so that the total token count stays within the model's maximum context window.

⚠ Workaround

Catch the context-length error (with openai>=1.0.0, an openai.BadRequestError whose code is context_length_exceeded) and programmatically truncate or summarize the system prompt or conversation history before retrying the request.
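
A minimal sketch of this retry loop is shown below, under stated assumptions. It relies on the exception exposing a code attribute equal to "context_length_exceeded", which is how openai>=1.0.0 API errors are expected to report it; verify this against your installed version. The helpers is_context_length_error and send_with_retry are hypothetical names, and send stands for any callable that performs the request, so the sketch does not import the SDK itself.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, str]


def is_context_length_error(exc: Exception) -> bool:
    """Duck-typed check (assumption): the error carries code 'context_length_exceeded'."""
    return getattr(exc, "code", None) == "context_length_exceeded"


def send_with_retry(
    send: Callable[[List[Message]], Any],
    messages: List[Message],
    max_retries: int = 3,
) -> Any:
    """Call send(messages); on context overflow, drop the oldest turn and retry."""
    current = list(messages)  # work on a copy, leave the caller's list intact
    for attempt in range(max_retries + 1):
        try:
            return send(current)
        except Exception as exc:
            # Candidates for removal: non-system messages, never the latest one.
            droppable = [
                i for i, m in enumerate(current[:-1]) if m["role"] != "system"
            ]
            if (
                not is_context_length_error(exc)
                or not droppable
                or attempt == max_retries
            ):
                raise  # unrelated error, nothing left to drop, or out of retries
            del current[droppable[0]]  # remove the oldest droppable message
```

With the real client, send could be a small wrapper such as lambda msgs: client.chat.completions.create(model="gpt-4o-mini", messages=msgs). Dropping one message per attempt is the simplest policy; summarizing the dropped turns is an alternative when earlier context matters.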

✓ Prevention

Use token counting libraries like tiktoken to monitor prompt length dynamically and enforce limits before sending requests to the API.

Python 3.9+ · openai >=1.0.0 · tested on 1.5.x

Verified 2026-04 · gpt-4o-mini, gemini-2.5-pro
