ChatMessageHistory: storing messages manually
Why this matters
Most real applications need to persist and retrieve conversation history from databases, file systems, or custom stores: automatic memory abstractions hide this. Understanding manual history construction lets you build production systems where you control exactly when, where, and how messages are stored.
Explanation
What it is: ChatMessageHistory is the practice of manually creating and maintaining lists of HumanMessage, AIMessage, and SystemMessage objects to represent a conversation. Instead of a framework managing history for you, you construct the message list explicitly and pass it to your chain.
How it works: Each message object has a content field (the text) and a type. You build a Python list, append new messages as the conversation progresses, then pass the entire list into your prompt template or chain via the invoke() method. The LLM sees the full conversation context, responds, and you append its response to the list manually.
When to use it: Use this when you need granular control over persistence: especially when messages are stored in a database, Redis, file system, or custom backend. This pattern is foundational for any application that survives beyond a single request.
Analogy
Think of a notebook where you write down every line of a conversation. Each time someone speaks, you write their message down. Before asking the AI a question, you read the entire notebook aloud so it knows the context. After it responds, you write that down too. If the conversation ends and you close the notebook, you have to manually open it again next time and read everything to remember where you were.
Code
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)
message_history = [
SystemMessage(content="You are a helpful assistant that explains programming concepts clearly.")
]
first_user_input = "What is a closure in Python?"
message_history.append(HumanMessage(content=first_user_input))
response_1 = llm.invoke(message_history)
message_history.append(AIMessage(content=response_1.content))
print("First turn:")
print(f"User: {first_user_input}")
print(f"AI: {response_1.content}")
print()
second_user_input = "Can you give me a concrete example with a counter function?"
message_history.append(HumanMessage(content=second_user_input))
response_2 = llm.invoke(message_history)
message_history.append(AIMessage(content=response_2.content))
print("Second turn:")
print(f"User: {second_user_input}")
print(f"AI: {response_2.content}")
print()
print(f"Total messages in history: {len(message_history)}")
for i, msg in enumerate(message_history):
print(f" {i}: {msg.__class__.__name__} - {msg.content[:50]}..." if len(msg.content) > 50 else f" {i}: {msg.__class__.__name__} - {msg.content}") First turn:
User: What is a closure in Python?
AI: A closure in Python is a function that "remembers" variables from the scope in which it was created, even after that scope has finished executing. It's a function object that has access to variables in its enclosing scope through the process of lexical scoping.
Second turn:
User: Can you give me a concrete example with a counter function?
AI: Here's a simple counter example:
```python
def make_counter():
count = 0
def increment():
nonlocal count
count += 1
return count
return increment
counter = make_counter()
print(counter()) # Output: 1
print(counter()) # Output: 2
print(counter()) # Output: 3
```
In this example, the `increment()` function is a closure. It "remembers" the `count` variable from `make_counter()`'s scope. Each time you call `counter()`, it increments and returns the stored `count` value. The variable persists between calls because it's part of the closure.
Total messages in history: 5
0: SystemMessage - You are a helpful assistant that explains programming concepts clearly.
1: HumanMessage - What is a closure in Python?
2: AIMessage - A closure in Python is a function that "remembers" variables from the scope in which it was created, even after that scope has finished executing. It's a function object that has access to variables in its enclosing scope through the process of lexical scoping.
3: HumanMessage - Can you give me a concrete example with a counter function?
4: AIMessage - Here's a simple counter example:
```python
def make_counter():
count = 0
def increment():
nonlocal count
count += 1
return count
return increment
counter = make_counter()
print(counter()) # Output: 1
print(counter()) # Output: 2
print(counter()) # Output: 3
```
In this example, the `increment()` function is a closure. It "remembers" the `count` variable from `make_counter()`'s scope. Each time you call `counter()`, it increments and returns the stored `count` value. The variable persists between calls because it's part of the closure. What just happened?
The code created a message history list starting with a system message. It then simulated a two-turn conversation: (1) appended a human message, invoked the LLM with the full history, appended the AI response; (2) repeated with a follow-up question. The LLM saw all prior messages and maintained context. At the end, the history contained 5 messages total: the system prompt plus two turns of human-AI exchanges.
Common gotcha
Developers often forget that you must append the AI's response back to the message history, or the next turn will have no memory of what the AI just said. If you only append human messages, the AI repeats itself or loses context because it never 'sees' its own prior outputs in the history list.
Error recovery
KeyError when passing to chainTypeError: object is not iterableLLM ignores prior contextExperienced dev note
In production, separate the in-memory message list from persistence. Load history from your database into the list, process the turn, then save only new messages back to storage: not the entire history every time. This saves latency and database writes. Also, implement a message retention policy (e.g., keep only the last N turns) to avoid token limits and cost creep as conversations grow old.
Check your understanding
If a user sends a third message but you only append their message without invoking the LLM and appending the response, what will happen on the fourth user message, and why?
Show answer hint
The answer requires understanding that the LLM only 'sees' what is in the message_history list at invocation time. If you skip appending the AI response after turn 3, the list never contains that response, so on turn 4 the LLM has no record of what it said in turn 3, breaking context continuity. The key insight is that the history list is the entire universe of context the LLM has access to.