How-to · Beginner · 3 min read

How to add memory to an AI chatbot

Quick answer
Add memory to an AI chatbot by storing conversation history and passing it as context in the messages parameter of the chat completion API. Use a list to accumulate past user and assistant messages, enabling the model to maintain context across turns.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0" (quote the version spec so the shell doesn't treat > as a redirect)

Setup

Install the openai Python package and set your API key as an environment variable.

bash
pip install "openai>=1.0"
export OPENAI_API_KEY="sk-..."  # replace with your actual key

Step by step

This example implements simple short-term memory by accumulating conversation messages in a list and sending the full list with each API call, so the model sees every previous turn.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Initialize conversation history
conversation = [
    {"role": "system", "content": "You are a helpful assistant."}
]

def chat_with_memory(user_input: str) -> str:
    # Append user message to conversation
    conversation.append({"role": "user", "content": user_input})

    # Call chat completion with full conversation history
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=conversation
    )

    # Extract assistant reply
    assistant_message = response.choices[0].message.content

    # Append assistant reply to conversation
    conversation.append({"role": "assistant", "content": assistant_message})

    return assistant_message

if __name__ == "__main__":
    print("Chatbot with memory. Type 'exit' to quit.")
    while True:
        user_text = input("You: ")
        if user_text.lower() == "exit":
            break
        reply = chat_with_memory(user_text)
        print(f"Bot: {reply}")
output
Chatbot with memory. Type 'exit' to quit.
You: Hello
Bot: Hello! How can I assist you today?
You: What's the weather like?
Bot: I don't have real-time weather data, but I can help you find a forecast if you want.

Common variations

  • Use a database or file to persist conversation history across sessions.
  • Limit memory size by truncating older messages to stay within token limits.
  • Use async calls with asyncio for scalable chatbots.
  • Swap models to balance cost and quality: gpt-4o-mini is inexpensive for memory-heavy chats; a larger model like gpt-4o gives better answers at higher cost.
python
import asyncio
import os
from openai import AsyncOpenAI  # the async client is required for awaited calls

client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

conversation = [
    {"role": "system", "content": "You are a helpful assistant."}
]

async def async_chat_with_memory(user_input: str) -> str:
    conversation.append({"role": "user", "content": user_input})
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=conversation
    )
    assistant_message = response.choices[0].message.content
    conversation.append({"role": "assistant", "content": assistant_message})
    return assistant_message

async def main():
    print("Async chatbot with memory. Type 'exit' to quit.")
    while True:
        # run blocking input() in a thread so it doesn't stall the event loop
        user_text = await asyncio.to_thread(input, "You: ")
        if user_text.lower() == "exit":
            break
        reply = await async_chat_with_memory(user_text)
        print(f"Bot: {reply}")

if __name__ == "__main__":
    asyncio.run(main())
output
Async chatbot with memory. Type 'exit' to quit.
You: Hi
Bot: Hello! How can I help you today?
You: Tell me a joke.
Bot: Why did the scarecrow win an award? Because he was outstanding in his field!
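
The first variation above, persisting history across sessions, can be sketched with a plain JSON file. This is a minimal sketch: the filename conversation.json and the load/save helper names are arbitrary choices for illustration, not part of the OpenAI API.

```python
import json
from pathlib import Path

HISTORY_FILE = Path("conversation.json")  # arbitrary filename for this sketch

DEFAULT_HISTORY = [
    {"role": "system", "content": "You are a helpful assistant."}
]

def load_history() -> list:
    """Load saved conversation history from disk, or start fresh."""
    if HISTORY_FILE.exists():
        return json.loads(HISTORY_FILE.read_text())
    return list(DEFAULT_HISTORY)

def save_history(conversation: list) -> None:
    """Write the conversation back to disk after each turn."""
    HISTORY_FILE.write_text(json.dumps(conversation, indent=2))
```

Call load_history() at startup instead of hard-coding the list, and save_history() after appending each assistant reply; the chatbot then resumes with full context the next time it runs.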

Troubleshooting

  • If you get context_length_exceeded errors, truncate older messages to fit token limits.
  • Ensure your OPENAI_API_KEY environment variable is set correctly.
  • Check for network issues if API calls fail.
  • Use print() debugging to verify conversation history is correctly updated.
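
One way to avoid context_length_exceeded errors, assuming a simple message-count cap is acceptable rather than exact token counting, is to keep the system prompt and only the most recent messages. The helper name and the default of 20 messages are illustrative choices:

```python
def truncate_history(conversation: list, max_messages: int = 20) -> list:
    """Keep the system prompt plus the last max_messages other messages."""
    system = [m for m in conversation if m["role"] == "system"]
    recent = [m for m in conversation if m["role"] != "system"][-max_messages:]
    return system + recent
```

Apply it to the history just before each chat completion call. For precise limits, count tokens with the tiktoken library instead of counting messages.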

Key Takeaways

  • Store conversation history in a list and send it with each chat completion call to maintain memory.
  • Truncate or persist conversation history externally to manage token limits and session continuity.
  • Use async API calls for scalable chatbots and switch models for cost or performance trade-offs.
Verified 2026-04 · gpt-4o-mini