Fix agent losing context across turns
Quick answer
To fix an agent losing context across turns, persist and pass the full conversation history in the
messages array on every API call. Use memory-management techniques such as summarization or external storage to maintain context efficiently across multiple turns.

Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install openai>=1.0
Setup
Install the openai Python package and set your API key as an environment variable.
- Install the OpenAI SDK v1+:

```shell
pip install openai --upgrade
```

- Set the environment variable:

```shell
export OPENAI_API_KEY='your_api_key'    # Linux/macOS
setx OPENAI_API_KEY "your_api_key"      # Windows
```

Output of pip install openai --upgrade:

```
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x
```
Step by step
Maintain full conversation history by appending each user and assistant message to a messages list and passing it on every API call. This preserves context across turns.
```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Initialize conversation history
messages = [
    {"role": "system", "content": "You are a helpful assistant."}
]

# User sends first message
user_input = "Hello, who won the world series in 2020?"
messages.append({"role": "user", "content": user_input})

# Call chat completion
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)
assistant_reply = response.choices[0].message.content
print("Assistant:", assistant_reply)

# Append assistant reply to history
messages.append({"role": "assistant", "content": assistant_reply})

# Next user turn
user_input = "Where was it played?"
messages.append({"role": "user", "content": user_input})

# Call chat completion again with full history
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)
assistant_reply = response.choices[0].message.content
print("Assistant:", assistant_reply)

# Append assistant reply to history
messages.append({"role": "assistant", "content": assistant_reply})
```

Output:

```
Assistant: The Los Angeles Dodgers won the World Series in 2020.
Assistant: The 2020 World Series was played at Globe Life Field in Arlington, Texas.
```
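The append-call-append pattern above can be factored into a small helper so that every turn updates the history the same way. This is a sketch, not part of the OpenAI SDK: chat_turn and the complete callable are hypothetical names, and in production complete would wrap client.chat.completions.create.

```python
from typing import Callable

def chat_turn(messages: list, user_input: str,
              complete: Callable[[list], str]) -> str:
    """Append the user message, get a reply, and record it in the history.

    `complete` is any function that maps a messages list to a reply string;
    in production it would wrap client.chat.completions.create.
    """
    messages.append({"role": "user", "content": user_input})
    reply = complete(messages)
    messages.append({"role": "assistant", "content": reply})
    return reply

# Demonstration with a stand-in completer (no API call made here):
def fake_complete(msgs):
    return f"You have sent {sum(m['role'] == 'user' for m in msgs)} message(s)."

history = [{"role": "system", "content": "You are a helpful assistant."}]
print(chat_turn(history, "Hello!", fake_complete))        # You have sent 1 message(s).
print(chat_turn(history, "Still with me?", fake_complete))  # You have sent 2 message(s).
```

Because the helper always appends both sides of the exchange before returning, the history can never silently drop a turn.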
Common variations
You can optimize context management by summarizing long histories or storing conversation state externally (e.g., in a database). For async usage, use the SDK's AsyncOpenAI client with await. Other models (e.g., gpt-4o) can be swapped in by changing the model parameter.
```python
import asyncio
import os
from openai import AsyncOpenAI

client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def chat():
    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    messages.append({"role": "user", "content": "Hello!"})
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages
    )
    reply = response.choices[0].message.content
    print("Assistant:", reply)

asyncio.run(chat())
```

Output:

```
Assistant: Hello! How can I assist you today?
```
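One way to implement the summarization idea: once the history grows past a threshold, collapse the older turns into a single system note and keep only the recent ones. This is a sketch under assumptions — compact_history and the summarize callable are illustrative names, and in production summarize would itself be a model call that condenses the old turns into a short paragraph.

```python
def compact_history(messages, summarize, keep_recent=4):
    """Replace older turns with one summary note, keeping the system
    prompt and the `keep_recent` most recent messages intact."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    if len(turns) <= keep_recent:
        return messages  # nothing to compact yet
    old, recent = turns[:-keep_recent], turns[-keep_recent:]
    summary = summarize(old)  # e.g. another chat.completions.create call
    note = {"role": "system",
            "content": f"Summary of earlier conversation: {summary}"}
    return system + [note] + recent

# Demonstration with a trivial stand-in summarizer:
history = [{"role": "system", "content": "You are a helpful assistant."}]
for i in range(6):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

compacted = compact_history(history, lambda old: f"{len(old)} earlier messages")
print(len(history), "->", len(compacted))  # 13 -> 6
```

The compacted list still starts with the system prompt and ends with the latest turns, so the model keeps both its instructions and the immediate context.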
Troubleshooting
- If context seems lost, verify you are passing the entire messages list, including all prior turns, on every API call.
- For very long conversations, truncate or summarize older messages to stay within token limits.
- Check for accidental overwrites of the messages list between turns.
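A rough token-budget check can catch the truncation case before the API rejects the request. The helper below is only a sketch: it uses the common ~4-characters-per-token heuristic rather than a real tokenizer (a library such as tiktoken would give exact counts), and estimate_tokens/trim_to_budget are illustrative names.

```python
def estimate_tokens(messages):
    """Very rough estimate: ~4 characters per token."""
    return sum(len(m["content"]) for m in messages) // 4

def trim_to_budget(messages, max_tokens):
    """Drop the oldest non-system messages until under budget."""
    trimmed = list(messages)
    while estimate_tokens(trimmed) > max_tokens and len(trimmed) > 1:
        # index 1 skips the system prompt at index 0
        del trimmed[1]
    return trimmed

# Demonstration: ten long user messages (~100 estimated tokens each)
history = [{"role": "system", "content": "You are a helpful assistant."}]
history += [{"role": "user", "content": "x" * 400} for _ in range(10)]
short = trim_to_budget(history, max_tokens=300)
print(len(history), "->", len(short))  # 11 -> 3
```

Dropping from the front keeps the most recent turns, which usually matter most for the next reply.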
Key Takeaways
- Always pass the full conversation history in the messages parameter to maintain context.
- Use summarization or external storage to manage long conversations within token limits.
- Avoid resetting or overwriting the messages list between turns to prevent losing context.
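For the external-storage option, the simplest durable form is writing the messages list to disk between turns and reloading it when the agent restarts. This is a minimal sketch using a JSON file; save_history/load_history are illustrative names, and a real agent might instead use a database keyed by conversation ID.

```python
import json
import os
import tempfile

def save_history(messages, path):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(messages, f)

def load_history(path):
    if not os.path.exists(path):
        # new conversation: start with just the system prompt
        return [{"role": "system", "content": "You are a helpful assistant."}]
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Round-trip demonstration:
path = os.path.join(tempfile.gettempdir(), "conversation.json")
if os.path.exists(path):
    os.remove(path)                   # start from a clean slate
history = load_history(path)          # fresh file -> system prompt only
history.append({"role": "user", "content": "Hello!"})
save_history(history, path)
restored = load_history(path)
print(len(restored))  # 2
```

Calling save_history after every turn means a crash or restart loses at most the in-flight turn, not the whole conversation.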