How-to · Beginner · 3 min read

How to add memory to OpenAI chatbot

Quick answer
To add memory to an OpenAI chatbot, store previous conversation messages in a list and include them in the messages parameter on each API call. This preserves context, enabling the chatbot to remember past interactions and respond accordingly.

Prerequisites

  • Python 3.8+
  • OpenAI API key (free tier works)
  • OpenAI Python SDK v1+: pip install "openai>=1.0" (quoted so the shell does not treat >= as a redirect)

Setup

Install the official OpenAI Python SDK and set your API key as an environment variable.

  • Run pip install openai to install the SDK.
  • Set your API key in your shell: export OPENAI_API_KEY='your_api_key_here' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key_here" (Windows; setx only affects new terminal sessions).
bash
pip install openai

Step by step

Use a list to keep track of the conversation history and pass it to the messages parameter on each request. This example shows a simple memory implementation that appends user and assistant messages.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Initialize conversation memory
conversation = [
    {"role": "system", "content": "You are a helpful assistant."}
]

def chat_with_memory(user_input):
    # Append user message to memory
    conversation.append({"role": "user", "content": user_input})

    # Call OpenAI chat completion with full conversation
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=conversation
    )

    # Extract assistant reply
    assistant_message = response.choices[0].message.content

    # Append assistant reply to memory
    conversation.append({"role": "assistant", "content": assistant_message})

    return assistant_message

# Example usage
if __name__ == "__main__":
    print(chat_with_memory("Hello, who won the world series in 2020?"))
    print(chat_with_memory("Where was it played?"))
output
The Los Angeles Dodgers won the World Series in 2020.
The 2020 World Series was played at Globe Life Field in Arlington, Texas.

Common variations

  • Async calls: Use asyncio with the SDK's AsyncOpenAI client, which exposes the same chat.completions.create method as awaitable calls.
  • Streaming responses: Use the stream=True parameter to receive tokens as they arrive.
  • Memory management: Limit conversation length by trimming older messages or summarizing context to stay within token limits.
  • Different models: Swap model="gpt-4o" with other supported models like gpt-4o-mini for cost or speed tradeoffs.
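The trimming mentioned above can be sketched as a simple sliding window that always preserves the system prompt. This is a minimal illustration, not part of the SDK; the trim_history name and the max_messages budget are assumptions, and production code typically counts tokens rather than messages.

```python
def trim_history(conversation, max_messages=10):
    """Keep the system prompt plus only the most recent messages.

    max_messages is an illustrative budget; counting tokens (e.g. with
    tiktoken) gives tighter control over context-window limits.
    """
    system = conversation[:1]   # preserve the system prompt at index 0
    rest = conversation[1:]
    return system + rest[-max_messages:]
```

Call trim_history(conversation) right before each client.chat.completions.create request so long chats never exceed the model's context window.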

Troubleshooting

  • If you get token limit exceeded errors, trim or summarize older conversation messages before sending.
  • If the chatbot forgets context, ensure you append both user and assistant messages to the conversation list.
  • If authentication fails, confirm that the OPENAI_API_KEY environment variable is set in the shell session running your script.
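The summarization option mentioned above can be sketched as a helper that condenses older messages into a single system note before the next request. This is an illustrative sketch, not SDK functionality: the summarize_older name, the prompt wording, and the keep_last default are all assumptions; client is any OpenAI v1-style client.

```python
def summarize_older(client, conversation, keep_last=4):
    """Replace all but the last keep_last messages with a one-message summary.

    Illustrative sketch: assumes conversation[0] is the system prompt and
    client follows the OpenAI v1 chat.completions.create interface.
    """
    system, rest = conversation[0], conversation[1:]
    if len(rest) <= keep_last:
        return conversation  # nothing old enough to summarize yet

    old, recent = rest[:-keep_last], rest[-keep_last:]
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in old)

    # Ask a cheaper model to compress the older turns into one paragraph
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Summarize this conversation in one short paragraph."},
            {"role": "user", "content": transcript},
        ],
    )
    summary = response.choices[0].message.content

    # Rebuild history: system prompt, summary note, then the recent turns
    return [
        system,
        {"role": "system", "content": f"Earlier conversation summary: {summary}"},
    ] + recent
```

Running this periodically (for example, whenever the history exceeds a threshold) keeps long conversations within token limits while retaining the gist of earlier turns.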

Key Takeaways

  • Store conversation history in a list and send it with each API call to maintain memory.
  • Trim or summarize conversation history to avoid exceeding token limits.
  • Use the official OpenAI Python SDK v1 with environment variable API keys for secure integration.
Verified 2026-04 · gpt-4o, gpt-4o-mini