How to build multi-turn conversation with Responses API
Quick answer
Keep the conversation history as a list of message objects and pass it in the messages parameter of every client.chat.completions.create call (the Chat Completions endpoint of the OpenAI API, which the examples here use). Append each user and assistant message to the list to preserve context across turns.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0"
Setup
Install the official openai Python package version 1.0 or higher and set your API key as an environment variable.
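- Install the package: pip install "openai>=1.0" (the quotes stop the shell from treating > as an output redirect)
- Set the environment variable: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows; takes effect in newly opened terminals)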
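A missing or blank key otherwise surfaces only later, as a KeyError or a 401 response. As a minimal sketch, a small check can fail early with a readable message. The helper name require_api_key is hypothetical and not part of the openai package.

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key from `env`, or raise a readable error if it is missing or blank.

    Hypothetical helper for illustration; `env` is injectable so the check
    can be exercised without touching the real environment.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the examples.")
    return key
```

The returned value could then be passed as api_key when constructing the client.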
Step by step
Maintain a messages list that stores the full conversation history. For each user input, append a {"role": "user", "content": ...} message, then call client.chat.completions.create with the updated messages. Append the assistant's reply to messages to preserve context for the next turn.
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# The history starts with the system prompt and grows by two messages per turn
messages = [
    {"role": "system", "content": "You are a helpful assistant."}
]

while True:
    user_input = input("User: ")
    if user_input.lower() in ["exit", "quit"]:
        print("Exiting conversation.")
        break
    messages.append({"role": "user", "content": user_input})
    # Send the full history so the model sees every earlier turn
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages
    )
    assistant_reply = response.choices[0].message.content
    print(f"Assistant: {assistant_reply}")
    # Store the reply so the next turn keeps its context
    messages.append({"role": "assistant", "content": assistant_reply})
output
User: Hello
Assistant: Hello! How can I assist you today?
User: What's the weather like?
Assistant: I don't have real-time weather data, but I can help you find a forecast online.
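The append, call, append pattern above can also be wrapped in a small class so that history handling is testable without network access. This is a sketch under stated assumptions: the Conversation class and the echo_complete stand-in are hypothetical names for illustration, and the model call is injected as a plain function rather than tied to the OpenAI client.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class Conversation:
    """Hypothetical wrapper that keeps the message history across turns."""

    def __init__(self, complete: Callable[[List[Message]], str],
                 system_prompt: str = "You are a helpful assistant."):
        # `complete` receives the full history and returns the assistant's text
        self._complete = complete
        self.messages: List[Message] = [{"role": "system", "content": system_prompt}]

    def send(self, user_input: str) -> str:
        self.messages.append({"role": "user", "content": user_input})
        try:
            reply = self._complete(self.messages)
        except Exception:
            # Roll back the user turn so a failed call leaves no dangling message
            self.messages.pop()
            raise
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def echo_complete(messages: List[Message]) -> str:
    """Stand-in for a real model call; echoes the latest user message."""
    return f"You said: {messages[-1]['content']}"


chat = Conversation(echo_complete)
chat.send("Hello")
chat.send("Still there?")
print(len(chat.messages))  # system prompt plus two user/assistant pairs
```

With the real client, complete could be a function that calls client.chat.completions.create(model="gpt-4o-mini", messages=m) and returns response.choices[0].message.content.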
Common variations
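- Async usage: Create the client with AsyncOpenAI and call await client.chat.completions.create(...) inside an async function.
- Streaming responses: Pass stream=True to receive tokens incrementally.
- Different models: Swap model="gpt-4o-mini" for another model your API key can access, such as gpt-4o. Models from other providers, such as claude-3-5-sonnet-20241022, are not served by the OpenAI API and need that provider's own client or endpoint.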
import os
import asyncio
from openai import AsyncOpenAI  # the sync OpenAI client cannot be awaited


async def async_chat():
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    while True:
        # Note: input() blocks the event loop; acceptable for a single-user demo
        user_input = input("User: ")
        if user_input.lower() in ["exit", "quit"]:
            print("Exiting conversation.")
            break
        messages.append({"role": "user", "content": user_input})
        stream = await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            stream=True
        )
        print("Assistant: ", end="", flush=True)
        reply_parts = []
        async for chunk in stream:
            if not chunk.choices:  # skip chunks that carry no choices
                continue
            delta = chunk.choices[0].delta.content or ""
            reply_parts.append(delta)
            print(delta, end="", flush=True)
        print()
        # Append the accumulated reply so the next turn keeps its context
        messages.append({"role": "assistant", "content": "".join(reply_parts)})


if __name__ == "__main__":
    asyncio.run(async_chat())
output
User: Hi
Assistant: Hello! How can I help you today?
User: Tell me a joke.
Assistant: Why did the scarecrow win an award? Because he was outstanding in his field!
Troubleshooting
- If you get 401 Unauthorized, verify that your OPENAI_API_KEY environment variable is set correctly.
- If context is lost, ensure you append both the user and assistant messages to the messages list each turn.
- For rate limits, handle 429 errors by retrying with exponential backoff.
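The backoff advice for 429 errors can be sketched as a generic retry helper. The function retry_with_backoff and its parameters are hypothetical, not part of the openai package. The exception type to retry on is passed in by the caller; with openai>=1.0 that would presumably be openai.RateLimitError, which is worth verifying against the installed version.

```python
import random
import time


def retry_with_backoff(call, retry_on=(Exception,), max_attempts=5,
                       base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Run `call()` and retry on the given exceptions with exponential backoff.

    The delay doubles after each failure (base_delay, 2x, 4x, ...), is capped
    at max_delay, and gets up to 10% random jitter so clients do not retry in
    lockstep. `sleep` is injectable so the helper can be tested without waiting.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay + random.uniform(0, delay * 0.1))
```

A call from the chat loop might then look like retry_with_backoff(lambda: client.chat.completions.create(model="gpt-4o-mini", messages=messages), retry_on=(openai.RateLimitError,)), under the assumption stated above.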
Key Takeaways
- Maintain a full messages list including system, user, and assistant roles to preserve conversation context.
- Use client.chat.completions.create with the updated messages on each turn to build multi-turn dialogs.
- Use the async client and streaming when you need responsive, scalable conversation handling.