How-to · Beginner · 3 min read

How to manage conversation history with the Chat Completions API

Quick answer
Manage conversation state with the OpenAI Chat Completions API by maintaining a list of messages that represents the conversation history and passing it with each client.chat.completions.create call. Append each user and assistant message to this list to preserve context across turns.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0"

Setup

Install the official OpenAI Python SDK and set your API key as an environment variable.

```bash
# Quote the requirement so the shell does not treat >= as a redirect
pip install "openai>=1.0"
```

Step by step

Maintain a messages list that holds the entire conversation history. Send this list with each request to client.chat.completions.create. Append the user input and assistant responses to keep context.

```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Initialize conversation history
messages = [
    {"role": "system", "content": "You are a helpful assistant."}
]

# User sends a message
user_input = "Hello, how do I manage conversation state?"
messages.append({"role": "user", "content": user_input})

# Create a chat completion with the full conversation history
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)

# Extract the assistant's reply
assistant_reply = response.choices[0].message.content
print("Assistant:", assistant_reply)

# Append the assistant's reply to the conversation
messages.append({"role": "assistant", "content": assistant_reply})
```

Output:

```text
Assistant: To manage conversation state, keep a list of messages including all previous user and assistant exchanges, and send it with each API call to maintain context.
```
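
Across multiple turns, this append–call–append pattern can be factored into a small helper. The sketch below assumes an already-constructed client; `chat_turn` is our name for the helper, not part of the SDK:

```python
def chat_turn(client, messages, user_input, model="gpt-4o-mini"):
    """Run one conversation turn: append the user message, call the
    API with the full history, then append and return the reply."""
    messages.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(model=model, messages=messages)
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    return reply
```

Calling chat_turn(client, messages, "...") again later continues the same conversation, because every turn sends the accumulated messages list.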

Common variations

  • Swap in other models, such as gpt-4o for higher-quality responses or gpt-4o-mini for speed and cost.
  • Implement streaming by setting stream=True in the request and iterating over chunks.
  • Use async calls with an async OpenAI client for concurrency.

```python
import asyncio
import os

from openai import AsyncOpenAI


async def async_chat():
    # Use the async client so the request and stream can be awaited
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]

    # Async streaming: iterate over chunks as they arrive
    stream = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        stream=True
    )

    async for chunk in stream:
        if not chunk.choices:  # some chunks carry no delta
            continue
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)

asyncio.run(async_chat())
```

Output:

```text
Hello! How can I assist you today?
```
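
The same chunk-handling loop works with the synchronous client. A minimal sketch, factoring the delta collection into a helper (`collect_stream` is our name) that accepts any iterable of chunks:

```python
def collect_stream(stream):
    """Print streamed deltas as they arrive and return the full reply,
    ready to append to the conversation history."""
    parts = []
    for chunk in stream:
        if not chunk.choices:  # some chunks carry no delta
            continue
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)
        parts.append(delta)
    print()
    return "".join(parts)
```

With a synchronous client, reply = collect_stream(client.chat.completions.create(model="gpt-4o-mini", messages=messages, stream=True)) yields the complete text, which you then append as the assistant message.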

Troubleshooting

  • If context is lost, ensure you append both user and assistant messages to the messages list before each request.
  • If you get errors about token limits, truncate older messages or summarize conversation history.
  • Check that your API key is correctly set in os.environ["OPENAI_API_KEY"].
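
One way to implement the truncation mentioned above is to keep the system prompt and drop the oldest turns once a rough size budget is exceeded. A sketch using a crude characters-per-token estimate (accurate counts require a tokenizer such as tiktoken; the 4-characters-per-token ratio is only an approximation):

```python
def truncate_history(messages, max_tokens=3000, chars_per_token=4):
    """Drop the oldest non-system messages until the estimated token
    count of the history fits within max_tokens."""
    def estimate(msgs):
        return sum(len(m["content"]) for m in msgs) // chars_per_token

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and estimate(system + rest) > max_tokens:
        rest.pop(0)  # remove the oldest turn first
    return system + rest
```

Run messages = truncate_history(messages) before each request; a summarization step could instead replace the dropped turns with a single assistant-written summary.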

Key Takeaways

  • Always maintain and send the full conversation messages list to preserve context.
  • Append both user inputs and assistant responses to the conversation history after each turn.
  • Use streaming and async calls for efficient and responsive conversation management.
  • Monitor token usage and truncate or summarize history to avoid hitting limits.
  • Set your API key securely via environment variables to avoid authentication errors.
Verified 2026-04 · gpt-4o-mini