How-to · Beginner · 3 min read

How to add memory to OpenAI chatbot

Quick answer
To add memory to an OpenAI chatbot, store previous conversation messages in a list and include them in the messages parameter on each API call. This preserves context, enabling the chatbot to remember past interactions and respond accordingly.

Prerequisites

  • Python 3.8+
  • OpenAI API key (free tier works)
  • OpenAI Python SDK v1+: pip install "openai>=1.0" (quoted so the shell does not treat >= as a redirect)

Setup

Install the official OpenAI Python SDK and set your API key as an environment variable.

  • Run pip install openai to install the SDK.
  • Set your API key in your shell: export OPENAI_API_KEY='your_api_key_here' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key_here" (Windows; setx only affects new terminal sessions).
bash
pip install openai

Step by step

Use a list to keep track of the conversation history and pass it to the messages parameter on each request. This example shows a simple memory implementation that appends user and assistant messages.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Initialize conversation memory
conversation = [
    {"role": "system", "content": "You are a helpful assistant."}
]

def chat_with_memory(user_input):
    # Append user message to memory
    conversation.append({"role": "user", "content": user_input})

    # Call OpenAI chat completion with full conversation
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=conversation
    )

    # Extract assistant reply
    assistant_message = response.choices[0].message.content

    # Append assistant reply to memory
    conversation.append({"role": "assistant", "content": assistant_message})

    return assistant_message

# Example usage
if __name__ == "__main__":
    print(chat_with_memory("Hello, who won the world series in 2020?"))
    print(chat_with_memory("Where was it played?"))
output
The Los Angeles Dodgers won the World Series in 2020.
The 2020 World Series was played at Globe Life Field in Arlington, Texas.

Common variations

  • Async calls: Use asyncio with the SDK's AsyncOpenAI client, which exposes the same chat.completions.create method as awaitable calls.
  • Streaming responses: Use the stream=True parameter to receive tokens as they arrive.
  • Memory management: Limit conversation length by trimming older messages or summarizing context to stay within token limits.
  • Different models: Swap model="gpt-4o" with other supported models like gpt-4o-mini for cost or speed tradeoffs.
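The trimming mentioned above can be sketched as a simple sliding window that always preserves the system prompt. This is a minimal illustration, not part of the SDK; the trim_history name and the max_messages budget are assumptions, and production code typically counts tokens rather than messages.

```python
def trim_history(conversation, max_messages=10):
    """Keep the system prompt plus only the most recent messages.

    max_messages is an illustrative budget; counting tokens (e.g. with
    tiktoken) gives tighter control over context-window limits.
    """
    system = conversation[:1]   # preserve the system prompt at index 0
    rest = conversation[1:]
    return system + rest[-max_messages:]
```

Call trim_history(conversation) right before each client.chat.completions.create request so long chats never exceed the model's context window.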

Troubleshooting

  • If you get token limit exceeded errors, trim or summarize older conversation messages before sending.
  • If the chatbot forgets context, ensure you append both user and assistant messages to the conversation list.
  • If authentication fails, confirm that the OPENAI_API_KEY environment variable is set in the shell session running your script.
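The summarization option mentioned above can be sketched as a helper that condenses older messages into a single system note before the next request. This is an illustrative sketch, not SDK functionality: the summarize_older name, the prompt wording, and the keep_last default are all assumptions; client is any OpenAI v1-style client.

```python
def summarize_older(client, conversation, keep_last=4):
    """Replace all but the last keep_last messages with a one-message summary.

    Illustrative sketch: assumes conversation[0] is the system prompt and
    client follows the OpenAI v1 chat.completions.create interface.
    """
    system, rest = conversation[0], conversation[1:]
    if len(rest) <= keep_last:
        return conversation  # nothing old enough to summarize yet

    old, recent = rest[:-keep_last], rest[-keep_last:]
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in old)

    # Ask a cheaper model to compress the older turns into one paragraph
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Summarize this conversation in one short paragraph."},
            {"role": "user", "content": transcript},
        ],
    )
    summary = response.choices[0].message.content

    # Rebuild history: system prompt, summary note, then the recent turns
    return [
        system,
        {"role": "system", "content": f"Earlier conversation summary: {summary}"},
    ] + recent
```

Running this periodically (for example, whenever the history exceeds a threshold) keeps long conversations within token limits while retaining the gist of earlier turns.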

Key Takeaways

  • Store conversation history in a list and send it with each API call to maintain memory.
  • Trim or summarize conversation history to avoid exceeding token limits.
  • Use the official OpenAI Python SDK v1 with environment variable API keys for secure integration.
Verified 2026-04 · gpt-4o, gpt-4o-mini