How to build a multi-turn conversation with the Responses API

Quick answer

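Keep a list of message objects that represents the conversation history and pass it as the messages parameter on every client.chat.completions.create call. Append each user and assistant message so context carries across turns. This pattern uses the Chat Completions endpoint; the Responses API proper is called through client.responses.create, where a turn can instead be linked to the previous one with previous_response_id.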

PREREQUISITES

Python 3.8+
OpenAI API key (free tier works)
pip install "openai>=1.0" (quoted so the shell does not treat > as a redirect)

Setup

Install the official openai Python package version 1.0 or higher and set your API key as an environment variable.

Install package: pip install "openai>=1.0"
Set environment variable: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows; takes effect in newly opened terminals)

bash

pip install "openai>=1.0"
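
Before making any API call, it can help to confirm that the variable is actually visible to Python. The sketch below is one way to do that; require_api_key is a hypothetical helper written for this article, not part of the openai package.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of the openai package.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the examples.")
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print("API key found.")
    except RuntimeError as exc:
        print(exc)
```

Failing early with a readable message is usually easier to debug than a KeyError or a 401 raised from inside the first request.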

Step by step

Maintain a messages list that stores the full conversation history. For each user input, append a {"role": "user", "content": ...} message, then call client.chat.completions.create with the updated messages. Append the assistant's reply to messages to preserve context for the next turn.

python

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

messages = [
    {"role": "system", "content": "You are a helpful assistant."}
]

while True:
    user_input = input("User: ")
    if user_input.lower() in ["exit", "quit"]:
        print("Exiting conversation.")
        break

    messages.append({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages
    )

    assistant_reply = response.choices[0].message.content
    print(f"Assistant: {assistant_reply}")

    messages.append({"role": "assistant", "content": assistant_reply})

output

User: Hello
Assistant: Hello! How can I assist you today?
User: What's the weather like?
Assistant: I don't have real-time weather data, but I can help you find a forecast online.
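
The loop above goes through the Chat Completions endpoint and resends the whole history on every turn. For comparison, here is a minimal sketch of the same loop on the Responses API, assuming a recent openai package that exposes client.responses.create, its previous_response_id parameter, and the output_text convenience property; check these against your installed version. The helper name ask is hypothetical.

```python
import os


def ask(client, user_text, previous_response_id=None, model="gpt-4o-mini"):
    """Send one turn through the Responses API.

    Returns (reply_text, response_id). Passing the returned id back in as
    previous_response_id links the next turn to this one, so the history
    does not have to be resent by hand. `ask` is a hypothetical helper.
    """
    kwargs = {"model": model, "input": user_text}
    if previous_response_id is not None:
        kwargs["previous_response_id"] = previous_response_id
    response = client.responses.create(**kwargs)
    return response.output_text, response.id


def main():
    # Imported here so the helper above can be exercised without the package.
    from openai import OpenAI

    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    previous_id = None
    while True:
        user_input = input("User: ")
        if user_input.lower() in ["exit", "quit"]:
            print("Exiting conversation.")
            break
        reply, previous_id = ask(client, user_input, previous_id)
        print(f"Assistant: {reply}")


if __name__ == "__main__":
    if "OPENAI_API_KEY" in os.environ:
        main()
    else:
        print("Set OPENAI_API_KEY to try the interactive loop.")
```

The trade-off is where the history lives: with a messages list the client owns the full transcript and can edit or trim it, while previous_response_id relies on the earlier response being stored on the server side.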

Common variations

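Async usage: Create the client with AsyncOpenAI instead of OpenAI, then await client.chat.completions.create(...) inside an async function.
Streaming responses: Pass stream=True to receive tokens incrementally, and accumulate the chunks so the full reply can be appended to messages.
Different models: Swap model="gpt-4o-mini" for another model available to your OpenAI account, such as gpt-4o. Models from other providers (for example claude-3-5-sonnet-20241022) are not served by the OpenAI API and need that provider's own client.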

python

import os
import asyncio
from openai import AsyncOpenAI  # async client; the sync OpenAI client cannot be awaited

async def async_chat():
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    messages = [{"role": "system", "content": "You are a helpful assistant."}]

    while True:
        # input() blocks the event loop; acceptable for a single-user console demo
        user_input = input("User: ")
        if user_input.lower() in ["exit", "quit"]:
            print("Exiting conversation.")
            break

        messages.append({"role": "user", "content": user_input})

        stream = await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            stream=True
        )

        print("Assistant: ", end="", flush=True)
        reply_parts = []
        async for chunk in stream:
            if not chunk.choices:
                continue  # skip chunks that carry no choices
            delta = chunk.choices[0].delta.content or ""
            reply_parts.append(delta)
            print(delta, end="", flush=True)

        print()
        # Append the accumulated reply so the next turn keeps its context
        messages.append({"role": "assistant", "content": "".join(reply_parts)})

if __name__ == "__main__":
    asyncio.run(async_chat())

output

User: Hi
Assistant: Hello! How can I help you today?
User: Tell me a joke.
Assistant: Why did the scarecrow win an award? Because he was outstanding in his field!

Troubleshooting

If you get 401 Unauthorized, verify your OPENAI_API_KEY environment variable is set correctly.
If context is lost, ensure you append both user and assistant messages to the messages list each turn.
For rate limits, handle 429 errors by retrying with exponential backoff.
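
The backoff advice in the last item can be sketched as a small wrapper. This is an illustrative helper, not part of the openai package: with_backoff and its parameters are hypothetical names, and the demo uses a stand-in function that fails twice so it runs offline.

```python
import random
import time


def with_backoff(call, retry_on=(Exception,), max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run call(); on a retryable error, wait and try again.

    The wait before retry number n (counting from 0) is
    base_delay * 2**n plus random jitter of up to base_delay seconds.
    Hypothetical helper for illustration.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)


if __name__ == "__main__":
    # Offline demo: a stand-in call that fails twice, then succeeds.
    attempts = {"count": 0}

    def flaky():
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise TimeoutError("simulated rate limit")
        return "ok"

    print(with_backoff(flaky, retry_on=(TimeoutError,), base_delay=0.01))
```

With the real client, the call would be something like lambda: client.chat.completions.create(model="gpt-4o-mini", messages=messages) with retry_on=(openai.RateLimitError,). The openai client also accepts a max_retries setting that retries some failures on its own, so a wrapper like this mainly matters when you want explicit control over the delays.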

Key Takeaways

Maintain a full messages list including system, user, and assistant roles to preserve conversation context.
Use client.chat.completions.create with updated messages for each turn to build multi-turn dialogs.
Use AsyncOpenAI for non-blocking calls and stream=True for incremental output, and append the accumulated streamed reply to messages so later turns keep their context.

Verified 2026-04 · gpt-4o-mini, claude-3-5-sonnet-20241022
