How-to · Beginner · 3 min read

How to maintain chat history with OpenAI

Quick answer
To maintain chat history with OpenAI, keep track of the conversation messages in a list and include the entire message history in each chat.completions.create call. This preserves context and enables the model to generate coherent responses based on prior exchanges.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0"

Setup

Install the official openai Python package and set your API key as an environment variable.

bash
pip install "openai>=1.0"

Step by step

Maintain chat history by storing all messages in a list and passing them with each API call. This example shows a simple chat loop preserving conversation context.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

messages = [
    {"role": "system", "content": "You are a helpful assistant."}
]

while True:
    user_input = input("User: ")
    if user_input.lower() in ["exit", "quit"]:
        break
    messages.append({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )

    assistant_message = response.choices[0].message.content
    print(f"Assistant: {assistant_message}")

    # Save the reply so the next request includes the full exchange
    messages.append({"role": "assistant", "content": assistant_message})
output
User: Hello
Assistant: Hello! How can I assist you today?
User: What's the weather like?
Assistant: I don't have real-time weather data, but I can help you find a forecast online.

Common variations

  • Use different models like gpt-4o-mini for faster, cheaper responses.
  • Implement asynchronous calls with asyncio and await for concurrency.
  • Stream responses by setting stream=True in chat.completions.create to receive tokens incrementally.
The example below combines the async and streaming variations. Note that async calls use the AsyncOpenAI client, and streaming is requested with stream=True on the same chat.completions.create method (there is no separate acreate in openai>=1.0).

python
import asyncio
import os

from openai import AsyncOpenAI

async def async_chat():
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    messages.append({"role": "user", "content": "Tell me a joke."})

    # Await the call itself, then iterate the chunks with `async for`
    stream = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        stream=True
    )

    print("Assistant: ", end="", flush=True)
    async for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)
    print()

asyncio.run(async_chat())
output
Assistant: Why did the scarecrow win an award? Because he was outstanding in his field!
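
Whichever variation you use, append the completed assistant reply back onto `messages` so the next turn keeps full context. A minimal sketch of collecting a streamed reply into one string (the `stream_reply` and `join_deltas` helpers are illustrative, not part of the SDK):

```python
import os

def join_deltas(deltas):
    """Concatenate streamed content deltas, skipping empty/None chunks."""
    return "".join(d for d in deltas if d)

def stream_reply(messages, model="gpt-4o-mini"):
    """Stream a completion, print it as it arrives, and return the full text."""
    from openai import OpenAI  # imported here so join_deltas stays usable without the SDK

    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    stream = client.chat.completions.create(model=model, messages=messages, stream=True)
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)
        parts.append(delta)
    print()
    return join_deltas(parts)
```

The caller would then run `reply = stream_reply(messages)` followed by `messages.append({"role": "assistant", "content": reply})`, mirroring the synchronous loop above.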

Troubleshooting

  • If the model forgets context, ensure you include the full message history in every request.
  • Trim or summarize long histories to stay within token limits.
  • Check that the OPENAI_API_KEY environment variable is set to avoid authentication errors.
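
One simple way to trim is to keep the system message plus the most recent turns. A sketch, assuming a single leading system message and that max_messages exceeds the system-message count (the `trim_history` helper is illustrative, not part of the SDK):

```python
def trim_history(messages, max_messages=20):
    """Return a shortened history: system message(s) first, then the newest turns.

    This is a crude cutoff by message count; a production version would
    count tokens instead so the request stays within the model's limit.
    """
    if len(messages) <= max_messages:
        return messages
    system = [m for m in messages if m["role"] == "system"]
    recent = [m for m in messages if m["role"] != "system"]
    return system + recent[-(max_messages - len(system)):]
```

Call it just before each request, e.g. `client.chat.completions.create(model="gpt-4o", messages=trim_history(messages))`, so the stored list can keep growing while each request stays bounded.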

Key Takeaways

  • Always pass the full conversation message list to chat.completions.create to maintain context.
  • Use the system role message to set assistant behavior at the start of the conversation.
  • For long chats, manage token limits by trimming or summarizing history before sending.
  • Async and streaming calls improve responsiveness and scalability in chat applications.
Verified 2026-04 · gpt-4o, gpt-4o-mini