High severity · Intermediate · Fix: 2-5 min

ConversationTokenBufferMemoryError

langchain.memory.conversation_token_buffer_memory.ConversationTokenBufferMemoryError

What this error means

The conversation history held by ConversationTokenBufferMemory grew past max_token_limit, so the full history can no longer be stored without truncation.

Stack trace

traceback

langchain.memory.conversation_token_buffer_memory.ConversationTokenBufferMemoryError: Token limit exceeded in ConversationTokenBufferMemory. Cannot add new messages without exceeding max token count.

QUICK FIX

Increase max_token_limit in ConversationTokenBufferMemory, or switch to a summarizing memory such as ConversationSummaryBufferMemory, so the stored history stays within the limit.

Why it happens

ConversationTokenBufferMemory stores conversation history up to a maximum token count to fit within LLM context limits. When the accumulated tokens exceed this limit, the memory buffer cannot add new messages without truncating or dropping older messages, triggering this error.
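To make the token-cap behaviour concrete, here is a minimal sketch of a buffer that keeps only the most recent messages that fit a budget. It does not use LangChain: count_tokens is a hypothetical whitespace tokenizer and fit_to_budget is an illustrative helper, so real token counts from an LLM tokenizer will differ.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Hypothetical stand-in tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def fit_to_budget(
    messages: List[str],
    max_token_limit: int,
    counter: Callable[[str], int] = count_tokens,
) -> List[str]:
    """Return the most recent messages whose combined token count fits the limit."""
    kept = list(messages)  # copy so the caller's list is not modified
    total = sum(counter(m) for m in kept)
    # Drop the oldest message until the remainder fits (or nothing is left).
    while kept and total > max_token_limit:
        total -= counter(kept.pop(0))
    return kept


history = ["Hello there", "Hi, how can I help you today?", "Tell me a joke"]
print(fit_to_budget(history, max_token_limit=12))
```

With a limit of 12 the first message is dropped, because the three messages together count 13 tokens under this simplified tokenizer.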

Detection

Monitor token usage in ConversationTokenBufferMemory by logging current token count before adding new messages; catch ConversationTokenBufferMemoryError to detect when limits are exceeded.
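One way to monitor usage is to log the running token count before each add and flag messages that would not fit. The sketch below is library-independent: count_tokens is a hypothetical word-based counter and check_before_add is an illustrative name, not a LangChain function.

```python
import logging
from typing import List

logger = logging.getLogger("token_budget")


def count_tokens(text: str) -> int:
    """Hypothetical stand-in tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def check_before_add(history: List[str], new_message: str, max_token_limit: int) -> bool:
    """Log current usage and report whether new_message fits within the limit."""
    used = sum(count_tokens(m) for m in history)
    incoming = count_tokens(new_message)
    logger.info("tokens used=%d, incoming=%d, limit=%d", used, incoming, max_token_limit)
    fits = used + incoming <= max_token_limit
    if not fits:
        logger.warning(
            "adding this message would exceed the limit by %d tokens",
            used + incoming - max_token_limit,
        )
    return fits
```

Calling check_before_add ahead of every save gives an early warning in the logs, so an overflow shows up before any exception handler has to deal with it.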

Causes & fixes

The max_token_limit parameter is set too low for the conversation length and LLM context size.

✓ Fix

Increase the max_token_limit parameter in ConversationTokenBufferMemory to accommodate longer conversations within the LLM's context window.

Conversation history grows without pruning or summarization, exceeding token limits.

✓ Fix

Prune the oldest messages, or condense them into a running summary (for example with ConversationSummaryBufferMemory), so the stored history shrinks before new messages are added.

The LLM has a large context window, but max_token_limit in ConversationTokenBufferMemory was never raised to match it.

✓ Fix

Set max_token_limit from the model's context window, leaving headroom for the prompt template and the model's response, so the memory uses the available context without overflowing it.
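The arithmetic behind that alignment can be sketched as a small helper. The function name and the reserve sizes are assumptions chosen for illustration; substitute the context window and reserves that apply to your model and prompt.

```python
def memory_token_budget(
    context_window: int,
    reserved_for_prompt: int,
    reserved_for_response: int,
) -> int:
    """Tokens left for conversation history after reserving room for the rest of the request."""
    budget = context_window - reserved_for_prompt - reserved_for_response
    if budget <= 0:
        raise ValueError("reserves leave no room for conversation history")
    return budget


# Illustrative numbers: an 8192-token window, 500 tokens of prompt
# template, and 1024 tokens kept free for the model's reply.
print(memory_token_budget(8192, 500, 1024))  # 6668
```

The result is a candidate value for max_token_limit; recompute it whenever the model or the prompt template changes.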

Code: broken vs fixed

Broken - triggers the error

python

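from langchain.memory import ConversationTokenBufferMemory
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")  # llm is required for token counting; reads OPENAI_API_KEY (model name is only an example)
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=5)  # Too low for any conversation
memory.save_context({'input': 'Hello'}, {'output': 'Hi there!'})  # This exchange alone exceeds 5 tokens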

Fixed - works correctly

python

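import os
from langchain.memory import ConversationTokenBufferMemory
from langchain_openai import ChatOpenAI

# The memory needs an LLM to count tokens; the model name here is only an example
llm = ChatOpenAI(model="gpt-4o-mini", api_key=os.environ["OPENAI_API_KEY"])
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=1500)  # Increased token limit to fit conversation
memory.save_context({'input': 'Hello'}, {'output': 'Hi there!'})  # Fits within the limit
print('Memory saved successfully')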

Increased max_token_limit to prevent token overflow errors when saving conversation context.

⚠

Workaround

Catch ConversationTokenBufferMemoryError and manually prune or summarize older messages before retrying to add new conversation entries.
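The catch, prune, and retry pattern might look like the sketch below. To keep it runnable without LangChain, TokenLimitExceeded stands in for the error named on this page and StrictBuffer is a hypothetical buffer that rejects oversized adds; both are assumptions for illustration, not library classes.

```python
from typing import List


class TokenLimitExceeded(Exception):
    """Stand-in for the error on this page, so the sketch runs without LangChain."""


class StrictBuffer:
    """Hypothetical buffer that refuses a message that would exceed the limit."""

    def __init__(self, max_token_limit: int) -> None:
        self.max_token_limit = max_token_limit
        self.messages: List[str] = []

    def token_count(self) -> int:
        # Simplified counting: one token per whitespace-separated word.
        return sum(len(m.split()) for m in self.messages)

    def add(self, message: str) -> None:
        if self.token_count() + len(message.split()) > self.max_token_limit:
            raise TokenLimitExceeded(message)
        self.messages.append(message)


def add_with_pruning(buffer: StrictBuffer, message: str) -> None:
    """Try to add the message; on overflow drop the oldest entry and retry."""
    while True:
        try:
            buffer.add(message)
            return
        except TokenLimitExceeded:
            if not buffer.messages:
                raise  # the message alone is larger than the limit
            buffer.messages.pop(0)
```

The re-raise on an empty buffer matters: without it, a single message larger than the limit would make the loop spin forever.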

✓

Prevention

Design conversation memory management to dynamically adjust token limits based on LLM context size and implement automatic summarization or pruning to keep token usage within bounds.

Python 3.9+ · langchain-core >=0.1.0 · tested on 0.2.x

Verified 2026-04
