ConversationTokenBufferMemoryError
langchain.memory.conversation_token_buffer_memory.ConversationTokenBufferMemoryError
Stack trace
langchain.memory.conversation_token_buffer_memory.ConversationTokenBufferMemoryError: Token limit exceeded in ConversationTokenBufferMemory. Cannot add new messages without exceeding max token count.
Why it happens
ConversationTokenBufferMemory stores conversation history up to a maximum token count to fit within LLM context limits. When the accumulated tokens exceed this limit, the memory buffer cannot add new messages without truncating or dropping older messages, triggering this error.
Detection
Log the buffer's current token count before each call that adds messages, and compare it against max_token_limit. Catch ConversationTokenBufferMemoryError around those calls to detect when the limit has been exceeded.
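The logging check described above can be sketched without any LangChain dependency. This is a minimal illustration, not the library's own mechanism: approx_token_count is a hypothetical stand-in tokenizer that counts whitespace-separated words, and check_token_budget is a hypothetical helper name.

```python
import logging

logger = logging.getLogger("memory_tokens")


def approx_token_count(messages):
    """Crude stand-in tokenizer (assumption): counts whitespace-separated words."""
    return sum(len(text.split()) for text in messages)


def check_token_budget(messages, new_message, max_token_limit, count=approx_token_count):
    """Log current and projected usage; return True if new_message still fits."""
    current = count(messages)
    projected = current + count([new_message])
    logger.info("tokens: current=%d projected=%d limit=%d",
                current, projected, max_token_limit)
    if projected > max_token_limit:
        logger.warning("adding this message would exceed the limit by %d tokens",
                       projected - max_token_limit)
        return False
    return True
```

With a real memory object, the count would more plausibly come from the model's own tokenizer, for example memory.llm.get_num_tokens_from_messages(memory.chat_memory.messages), assuming the installed LangChain version exposes those attributes.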
Causes & fixes
Cause: The max_token_limit parameter is set too low for the conversation length and LLM context size.
Fix: Increase max_token_limit in ConversationTokenBufferMemory so that longer conversations fit, while staying within the LLM's context window.
Cause: Conversation history grows without pruning or summarization and exceeds the token limit.
Fix: Summarize or prune stored conversation history before adding new messages.
Cause: The LLM has a large context window, but the ConversationTokenBufferMemory token limit was never adjusted to match.
Fix: Size max_token_limit against the model's maximum context tokens, leaving headroom for the prompt template and the model's response rather than using the full window.
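One way to pick a max_token_limit that tracks the model's context window is to subtract what the rest of the request needs. The sketch below is plain arithmetic under stated assumptions: memory_token_budget and all of its parameter names are hypothetical, and the numbers in the usage note are illustrative, not taken from any particular model.

```python
def memory_token_budget(context_window, prompt_overhead, max_response_tokens,
                        safety_margin_tokens=0):
    """Return a max_token_limit that leaves room for the prompt and the reply.

    context_window:       total tokens the model accepts (assumed known)
    prompt_overhead:      tokens used by the system prompt and template
    max_response_tokens:  tokens reserved for the model's answer
    safety_margin_tokens: extra slack for tokenizer differences
    """
    usable = context_window - prompt_overhead - max_response_tokens - safety_margin_tokens
    if usable <= 0:
        raise ValueError("reservations exceed the context window")
    return usable
```

For example, memory_token_budget(8192, 500, 1024) leaves 6668 tokens for stored history, and the result can be passed as max_token_limit when constructing the memory.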
Code: broken vs fixed
# Broken: limit too low for the conversation
from langchain.memory import ConversationTokenBufferMemory
# llm is assumed to be an already-constructed LangChain model; the class requires it for token counting
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=100)  # Too low for the conversation
memory.save_context({'input': 'Hello'}, {'output': 'Hi there!'})  # Raises ConversationTokenBufferMemoryError once the stored history exceeds the limit
# Fixed: limit raised to fit the conversation
from langchain.memory import ConversationTokenBufferMemory
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=1500)  # Increased token limit to fit the conversation
memory.save_context({'input': 'Hello'}, {'output': 'Hi there!'})  # Works without error
print('Memory saved successfully')
Workaround
Catch ConversationTokenBufferMemoryError and manually prune or summarize older messages before retrying to add new conversation entries.
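The catch, prune, and retry loop can be sketched as follows. Everything here is a stand-in so the example runs on its own: TokenLimitExceeded is a hypothetical substitute for ConversationTokenBufferMemoryError, demo_save imitates a save call that rejects oversized histories, and word counts stand in for real token counts.

```python
class TokenLimitExceeded(Exception):
    """Hypothetical stand-in for ConversationTokenBufferMemoryError."""


def demo_save(history, new_message, limit=6):
    """Toy save function (assumption): rejects histories over `limit` words."""
    total = sum(len(m.split()) for m in history) + len(new_message.split())
    if total > limit:
        raise TokenLimitExceeded(total)
    history.append(new_message)


def save_with_pruning(history, new_message, save, max_retries=50):
    """Try to save; on a token-limit error drop the oldest message and retry."""
    for _ in range(max_retries):
        try:
            save(history, new_message)
            return history
        except TokenLimitExceeded:
            if not history:
                raise  # nothing left to prune; the new message alone is too large
            history.pop(0)  # discard the oldest entry, then try again
    raise RuntimeError("gave up after max_retries pruning attempts")
```

Dropping from the front keeps the most recent turns, which is usually what the next prompt needs most. In real code the except clause would name the library's exception and save would wrap memory.save_context.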
Prevention
Design conversation memory management to dynamically adjust token limits based on LLM context size and implement automatic summarization or pruning to keep token usage within bounds.
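Automatic summarization can be approximated by folding the oldest messages into a single summary entry whenever the history outgrows its budget. This is a sketch under assumptions, not a LangChain feature: compact_history, word_count, and placeholder_summary are hypothetical names, the summarizer is a placeholder where an LLM call would normally go, and summary_budget assumes the summary never exceeds that many tokens.

```python
def word_count(messages):
    """Stand-in token counter (assumption): whitespace-separated words."""
    return sum(len(m.split()) for m in messages)


def placeholder_summary(messages):
    """Placeholder summarizer; a real one would call an LLM."""
    return f"[summary of {len(messages)} earlier messages]"


def compact_history(messages, max_tokens, summarize, count_tokens, summary_budget=50):
    """Return a new list that fits max_tokens, folding old messages into a summary.

    The newest message is always kept, even if it alone exceeds the budget.
    """
    kept = list(messages)
    if count_tokens(kept) <= max_tokens:
        return kept  # already within bounds, nothing to do
    folded = []
    target = max_tokens - summary_budget  # leave room for the summary itself
    while len(kept) > 1 and count_tokens(kept) > target:
        folded.append(kept.pop(0))
    return [summarize(folded)] + kept if folded else kept
```

Calling this before each save, with max_tokens derived from the model's context size, keeps usage within bounds without discarding older context outright.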