How-to · Beginner · 3 min read

How to chain Responses API calls

Quick answer
To chain multiple OpenAI API calls (shown here with chat.completions.create), capture the output from one response and include it as context in the next call's messages array. This lets you build multi-turn conversations or workflows by passing prior responses as input to subsequent requests.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0" (quotes prevent the shell from treating >= as redirection)

Setup

Install the official OpenAI Python SDK and set your API key as an environment variable.

  • Install the SDK and set your API key in your shell:
bash
pip install openai
export OPENAI_API_KEY='your_api_key_here'

Step by step

This example shows how to chain two chat.completions.create calls. The first call asks a question, and the second call uses the first response as context to ask a follow-up.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# First API call
messages1 = [{"role": "user", "content": "What is the capital of France?"}]
response1 = client.chat.completions.create(model="gpt-4o", messages=messages1)
answer1 = response1.choices[0].message.content
print("Answer 1:", answer1)

# Chain second API call: replay the first exchange (user question plus
# assistant answer) before the follow-up, so "that city" has a referent
messages2 = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": answer1},
    {"role": "user", "content": "What is the population of that city?"}
]
response2 = client.chat.completions.create(model="gpt-4o", messages=messages2)
answer2 = response2.choices[0].message.content
print("Answer 2:", answer2)
output
Answer 1: Paris is the capital of France.
Answer 2: The population of Paris is approximately 2.1 million people.
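As the title suggests, the same chain can also be expressed with OpenAI's Responses API, which links calls server-side via previous_response_id so you don't have to resend the whole history yourself. A minimal sketch under that assumption; the helper name chain_responses is ours, and the model and prompts are illustrative:

```python
import os

def chain_responses(client, first_prompt, follow_up, model="gpt-4o"):
    """Chain two Responses API calls; the second call references the
    first response's id so the API carries the context server-side."""
    first = client.responses.create(model=model, input=first_prompt)
    second = client.responses.create(
        model=model,
        input=follow_up,
        previous_response_id=first.id,  # links this call to the first one
    )
    return first.output_text, second.output_text

if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI
    a1, a2 = chain_responses(
        OpenAI(),
        "What is the capital of France?",
        "What is the population of that city?",
    )
    print("Answer 1:", a1)
    print("Answer 2:", a2)
```

With this approach you trade explicit control of the messages list for a simpler call, since the server reconstructs the prior turns from the response id.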

Common variations

You can chain calls asynchronously using the SDK's AsyncOpenAI client with async and await. You can also swap in other OpenAI models such as gpt-4o-mini; note that claude-3-5-sonnet-20241022 is an Anthropic model and requires the Anthropic SDK rather than the OpenAI one. For longer chains, accumulate the conversation history in the messages list to maintain context.

python
import os
import asyncio
from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

    messages = [{"role": "user", "content": "Who wrote 'Pride and Prejudice'?"}]
    response = await client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    answer = response.choices[0].message.content
    print("Answer:", answer)

    # Chain next call with previous answer
    messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": "When was it published?"})

    response2 = await client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    print("Follow-up answer:", response2.choices[0].message.content)

asyncio.run(main())
output
Answer: Jane Austen wrote 'Pride and Prejudice'.
Follow-up answer: It was published in 1813.
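For chains longer than two calls, the accumulate-history pattern generalizes to a loop that appends each answer before asking the next question. A sketch under the same SDK setup; run_chain is a hypothetical helper name:

```python
import os

def run_chain(client, questions, model="gpt-4o-mini"):
    """Ask each question in turn, resending the full conversation
    history on every call so the model keeps its context."""
    history = []
    answers = []
    for question in questions:
        history.append({"role": "user", "content": question})
        response = client.chat.completions.create(model=model, messages=history)
        answer = response.choices[0].message.content
        # Record the model's reply so later questions can refer back to it
        history.append({"role": "assistant", "content": answer})
        answers.append(answer)
    return answers

if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI
    for answer in run_chain(
        OpenAI(),
        ["Who wrote 'Pride and Prejudice'?", "When was it published?"],
    ):
        print(answer)
```

Because the entire history is resent each turn, token usage grows with chain length; for long chains consider truncating or summarizing older turns.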

Troubleshooting

  • If responses seem unrelated, ensure you pass the full conversation history in the messages array to maintain context.
  • If you get rate limit errors, add retries with exponential backoff.
  • Check your API key environment variable is set correctly to avoid authentication errors.
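For the rate-limit case, here is a minimal retry sketch with exponential backoff and jitter; with_backoff is our illustrative helper, and in real code you would pass retry_on=(openai.RateLimitError,) and wrap each API call:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0, retry_on=(Exception,)):
    """Run call(), retrying with exponentially growing delays plus
    random jitter whenever one of the retry_on exceptions is raised."""
    for attempt in range(max_retries):
        try:
            return call()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            # 1s, 2s, 4s, ... plus jitter to avoid synchronized retries
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.25))
```

For example: with_backoff(lambda: client.chat.completions.create(model="gpt-4o", messages=messages), retry_on=(openai.RateLimitError,)).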

Key Takeaways

  • Chain responses by passing prior outputs as messages in subsequent API calls.
  • Maintain full conversation history in the messages list to preserve context.
  • Use async calls for efficient multi-step workflows.
  • Always set your API key via environment variables for security.
  • Handle rate limits and errors gracefully in production.
Verified 2026-04 · gpt-4o, gpt-4o-mini, claude-3-5-sonnet-20241022