High severity · Intermediate · Fix: 5-10 min

ConversationHistoryTooLongError

openai.error.ConversationHistoryTooLongError

What this error means

The conversation history, measured in tokens, exceeds the model's maximum context window, so the API rejects the request instead of processing it.

Stack trace

traceback

openai.error.ConversationHistoryTooLongError: The total tokens in the conversation history exceed the model's maximum context window size and cannot be processed.

QUICK FIX

Truncate or summarize conversation history to fit within the model's max token limit before sending the request.

Why it happens

LLMs have a fixed maximum context window, measured in tokens. When the accumulated conversation history, the new prompt, and the space reserved for the reply together exceed this limit, the API rejects the request with an error. Truncating the history on the client side avoids the error but discards earlier context.

Detection

Count the tokens in the conversation history before each request and log the count. Catch ConversationHistoryTooLongError to detect requests that still exceed the limit.
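A minimal sketch of the pre-request check described above. The names `estimate_tokens` and `check_budget`, the 80% warning threshold, and the word-count heuristic (words × 1.5, the same approximation used in the fixed example on this page) are illustrative assumptions, not part of any library; swap in a real tokenizer for exact counts.

```python
import logging
import math

logger = logging.getLogger("context_budget")


def estimate_tokens(messages):
    """Rough estimate: word count times 1.5, rounded up. Not a real tokenizer."""
    words = sum(len((m.get("content") or "").split()) for m in messages)
    return math.ceil(words * 1.5)


def check_budget(messages, limit, warn_ratio=0.8):
    """Log the estimated token count and classify it as 'ok', 'near', or 'over'."""
    used = estimate_tokens(messages)
    if used > limit:
        logger.error("history uses ~%d tokens, over the limit of %d", used, limit)
        return "over"
    if used >= warn_ratio * limit:
        logger.warning("history uses ~%d tokens, close to the limit of %d", used, limit)
        return "near"
    logger.debug("history uses ~%d of %d tokens", used, limit)
    return "ok"
```

Calling `check_budget(messages, limit)` before every request gives a log trail of token growth, so the point where a conversation approaches the limit is visible before the API starts rejecting requests.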

Causes & fixes

Accumulated conversation history tokens exceed the model's maximum context window size.

✓ Fix

Implement a sliding window or summary strategy to truncate or compress older messages before sending the request.
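The summary strategy could look roughly like the sketch below. `compress_history` and the `summarize` callable are hypothetical names: `summarize` stands for whatever produces the summary text (for example, a separate model call), and is passed in so the sketch stays independent of any client library.

```python
def compress_history(messages, keep_last, summarize):
    """Fold all but the last `keep_last` non-system messages into one summary message.

    `summarize` is a caller-supplied function that takes a list of messages
    and returns a short string; it is an assumption of this sketch.
    """
    # Keep a leading system prompt, if present, untouched.
    system = [m for m in messages[:1] if m["role"] == "system"]
    rest = messages[len(system):]
    if len(rest) <= keep_last:
        return list(messages)  # nothing old enough to compress

    cut = len(rest) - keep_last
    old, recent = rest[:cut], rest[cut:]
    summary = {
        "role": "system",
        "content": "Summary of earlier conversation: " + summarize(old),
    }
    return system + [summary] + recent
```

Compared with a plain sliding window, this keeps a compressed trace of the older turns at the cost of one extra summarization step.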

Including large system or user messages repeatedly without pruning.

✓ Fix

Remove or shorten redundant or less relevant messages from the conversation history before each API call.
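One way to prune repeated large messages is sketched below. The function name `prune_redundant` and the 200-character threshold are assumptions chosen for illustration; the threshold exists so that short, legitimately repeated replies (such as "yes") are left alone.

```python
def prune_redundant(messages, min_chars=200):
    """Drop exact repeats of long messages, keeping the first occurrence.

    Messages shorter than `min_chars` are never removed.
    """
    seen = set()
    pruned = []
    for m in messages:
        content = m["content"].strip()
        key = (m["role"], content)
        if len(content) >= min_chars and key in seen:
            continue  # a long message already present in the history
        seen.add(key)
        pruned.append(m)
    return pruned
```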

Using a model with a smaller context window than required for the conversation length.

✓ Fix

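Switch to a model with a larger context window than the one currently in use, such as gemini-2.5-pro, to accommodate longer histories. Note that gpt-4o and gpt-4o-mini share the same 128k-token context window, so moving between them does not help.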
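A rough sketch of choosing a model by estimated history size. The `CONTEXT_WINDOWS` values are assumptions taken from published model specifications and should be checked against current provider documentation; `pick_model` and `reply_reserve` are hypothetical names. Models from different providers generally also need different clients or endpoints, which this sketch does not cover.

```python
# Assumed context window sizes in tokens; verify against provider docs.
CONTEXT_WINDOWS = {
    "gpt-4o-mini": 128_000,
    "gpt-4o": 128_000,
    "gemini-2.5-pro": 1_048_576,
}


def pick_model(estimated_tokens, reply_reserve=4096,
               preferred=("gpt-4o-mini", "gemini-2.5-pro")):
    """Return the first preferred model whose window fits history plus reply, else None."""
    needed = estimated_tokens + reply_reserve
    for name in preferred:
        if needed <= CONTEXT_WINDOWS[name]:
            return name
    return None
```

A `None` result means no listed model fits, so the history has to be truncated or summarized regardless of the model choice.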

Code: broken vs fixed

Broken - triggers the error

python

from openai import OpenAI
import os

client = OpenAI(api_key=os.environ['OPENAI_API_KEY'])

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    # ... very long conversation history ...
]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)  # This line triggers ConversationHistoryTooLongError
print(response.choices[0].message.content)

Fixed - works correctly

python

from openai import OpenAI
import os

client = OpenAI(api_key=os.environ['OPENAI_API_KEY'])

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    # ... very long conversation history ...
]

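# Drop the oldest messages until the estimated token count fits the budget
MAX_TOKENS = 2048  # illustrative budget only; gpt-4o-mini's real context window is far larger (128k tokens)

def count_tokens(messages):
    # Rough approximation (words * 1.5); use a real tokenizer such as tiktoken for exact counts
    return sum(len(m['content'].split()) for m in messages) * 1.5

# Stop when only the system prompt and one message remain, so pop(1) cannot raise IndexError
while count_tokens(messages) > MAX_TOKENS and len(messages) > 2:
    messages.pop(1)  # remove oldest user/assistant message, keep system prompt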

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)  # Fixed: truncated history to fit context window
print(response.choices[0].message.content)

Added logic to truncate conversation history by removing oldest messages until token count fits within the model's max context window, preventing the error.

⚠

Workaround

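Catch ConversationHistoryTooLongError and, on exception, programmatically remove the oldest messages or summarize them before retrying the request. With openai >=1.0.0, exception classes are exposed at the top level of the package (for example openai.BadRequestError) rather than under openai.error, so check which class your installed version actually raises.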
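The catch-and-retry workaround might be sketched as follows. To keep the example runnable without a network call, `send` is any callable that takes the message list and returns a response, and `HistoryTooLong` is a stand-in exception defined here; in real code, pass the exception class your client actually raises as `too_long_exc`. All names are hypothetical.

```python
class HistoryTooLong(Exception):
    """Stand-in for whatever exception the client raises on context overflow."""


def send_with_trimming(send, messages, too_long_exc=HistoryTooLong,
                       max_retries=5, drop_per_retry=2):
    """Call `send`; on overflow, drop the oldest non-system messages and retry."""
    history = list(messages)  # work on a copy, leave the caller's list intact
    for attempt in range(max_retries + 1):
        try:
            return send(history)
        except too_long_exc:
            # Index of the first message that may be dropped (skip a system prompt).
            start = 1 if history and history[0]["role"] == "system" else 0
            droppable = len(history) - start
            if droppable <= 1 or attempt == max_retries:
                raise  # nothing left to trim, or retries exhausted
            drop = min(drop_per_retry, droppable - 1)
            del history[start:start + drop]
```

Dropping messages in pairs (`drop_per_retry=2`) tends to keep user and assistant turns aligned, though that default is a design choice rather than a requirement.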

✓

Prevention

Design conversation management to track token usage and proactively truncate or summarize history to stay within model context limits, or use models with larger context windows.
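A minimal sketch of proactive conversation management: a buffer that tracks an estimated token total and evicts the oldest turns as new ones arrive. `ConversationBuffer` is a hypothetical class, and its default counter reuses the rough words × 1.5 approximation from the fixed example; pass a real tokenizer-based `count` function for accurate budgeting.

```python
import math


class ConversationBuffer:
    """Keeps a system prompt plus as many recent turns as fit in `max_tokens`."""

    def __init__(self, system_prompt, max_tokens, count=None):
        self.system = {"role": "system", "content": system_prompt}
        self.turns = []
        self.max_tokens = max_tokens
        self._count = count or self._approx

    @staticmethod
    def _approx(message):
        # Rough estimate only: words * 1.5, rounded up.
        return math.ceil(len(message["content"].split()) * 1.5)

    def total(self):
        """Estimated tokens for the system prompt and all retained turns."""
        return sum(self._count(m) for m in [self.system, *self.turns])

    def add(self, role, content):
        """Append a turn, then evict oldest turns until the budget is met."""
        self.turns.append({"role": role, "content": content})
        while self.total() > self.max_tokens and len(self.turns) > 1:
            self.turns.pop(0)

    def messages(self):
        """Message list ready to pass to a chat completion call."""
        return [self.system, *self.turns]
```

The newest turn is always kept, so a single oversized message can still exceed the budget; checking `total()` before sending covers that case.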

Python 3.9+ · openai >=1.0.0 · tested on 1.5.x

Verified 2026-04 · gpt-4o-mini, gemini-2.5-pro
