How to summarize conversation history

Quick answer
Use a chat model such as gpt-4o to summarize conversation history: append a user message requesting a concise summary to the full message list, then pass that list to client.chat.completions.create(). The content of the returned message is the summary.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0"

Setup

Install the official OpenAI Python SDK and set your API key as an environment variable.

bash
pip install "openai>=1.0"
export OPENAI_API_KEY="your-api-key"

Step by step

This example shows how to summarize a conversation history by sending the full chat messages to the gpt-4o model with a prompt to summarize.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

conversation_history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi, can you help me with my project?"},
    {"role": "assistant", "content": "Sure! What do you need help with?"},
    {"role": "user", "content": "I want to summarize our chat so far."}
]

# Add a user message prompting for a summary
messages = conversation_history + [
    {"role": "user", "content": "Please provide a concise summary of our conversation so far."}
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages
)

summary = response.choices[0].message.content
print("Summary:\n", summary)
output
Summary:
 You asked for help with your project and want to summarize the chat so far. I am here to assist you.
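
For reuse, the steps above can be wrapped in a small helper. The following is a minimal sketch rather than part of the original example: the names build_summary_messages and summarize_history are hypothetical, and the client is passed in as an argument so the function can be exercised without a network call.

```python
from typing import Dict, List

Message = Dict[str, str]

DEFAULT_INSTRUCTION = "Please provide a concise summary of our conversation so far."


def build_summary_messages(history: List[Message],
                           instruction: str = DEFAULT_INSTRUCTION) -> List[Message]:
    """Return a new message list: the history followed by a summary request."""
    # Build a new list so the caller's history is not mutated.
    return list(history) + [{"role": "user", "content": instruction}]


def summarize_history(client, history: List[Message], model: str = "gpt-4o") -> str:
    """Ask the chat model for a summary; `client` is assumed to be an openai.OpenAI instance."""
    response = client.chat.completions.create(
        model=model,
        messages=build_summary_messages(history),
    )
    return response.choices[0].message.content
```

With the client and conversation_history from the example above, a call would look like summarize_history(client, conversation_history).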

Common variations

  • Use gpt-4o-mini for faster, cheaper summaries with slightly less detail.
  • Use the AsyncOpenAI client with asyncio and await client.chat.completions.create(...) for non-blocking applications.
  • Stream the summary tokens by setting stream=True to display partial results as they arrive.
python
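import asyncio
import os
from openai import AsyncOpenAI

async def async_summarize():
    # AsyncOpenAI is required here: the synchronous OpenAI client's create() cannot be awaited.
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])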
    conversation_history = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me about AI."}
    ]
    messages = conversation_history + [
        {"role": "user", "content": "Summarize this conversation."}
    ]
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        stream=True
    )
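    async for chunk in response:
        # Skip chunks without choices; delta.content is None when a chunk carries no text.
        if chunk.choices:
            print(chunk.choices[0].delta.content or "", end="", flush=True)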

asyncio.run(async_summarize())
output
You asked about AI and requested a summary of the conversation.
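
When streaming, the printed chunks often also need to be assembled into the final summary string, for example to store it. The sketch below shows one way to do that. collect_stream is a hypothetical helper, and it assumes each chunk exposes choices[0].delta.content as in the example above, where the content may be None and choices may be empty.

```python
from typing import AsyncIterator


async def collect_stream(stream: AsyncIterator) -> str:
    """Print streamed text as it arrives and return the full summary."""
    parts = []
    async for chunk in stream:
        # Some chunks carry no choices or no text; skip them.
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:
            print(text, end="", flush=True)
            parts.append(text)
    print()  # finish the line once the stream ends
    return "".join(parts)
```

Inside async_summarize, the printing loop could then be replaced by summary = await collect_stream(response).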

Troubleshooting

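  • If the summary is too long, request a specific length in the prompt (for example, "in two sentences") or cap the reply with the max_tokens parameter. If the conversation itself exceeds the model's context window, trim older messages before sending.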
  • If you get an authentication error, verify your OPENAI_API_KEY environment variable is set correctly.
  • If the model returns irrelevant summaries, clarify the prompt to explicitly ask for a concise summary.
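
If summaries run long or the history is large, two common adjustments are trimming the history before sending it and capping the length of the reply. The following is a minimal sketch under stated assumptions: trim_history and summarize_trimmed are hypothetical helper names, the 20-message cutoff and the 150-token cap are arbitrary illustrative values, and system messages are assumed to sit at the start of the history. max_tokens is the Chat Completions parameter that limits the number of generated tokens.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(history: List[Message], max_messages: int = 20) -> List[Message]:
    """Keep all system messages plus the most recent `max_messages` others."""
    system = [m for m in history if m["role"] == "system"]
    others = [m for m in history if m["role"] != "system"]
    # A slice of [-0:] would return everything, so handle zero explicitly.
    recent = others[-max_messages:] if max_messages > 0 else []
    return system + recent


def summarize_trimmed(client, history: List[Message], model: str = "gpt-4o-mini") -> str:
    """Summarize a trimmed history; `client` is assumed to be an openai.OpenAI instance."""
    messages = trim_history(history) + [
        {"role": "user", "content": "Summarize this conversation in at most three sentences."}
    ]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        max_tokens=150,  # hard cap on reply length; illustrative value
    )
    return response.choices[0].message.content
```

A hard token cap can cut the reply off mid-sentence, so stating the desired length in the prompt as well tends to give cleaner results.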

Key Takeaways

  • Use the full conversation messages as input to the chat completion API to summarize history.
  • Add a user prompt explicitly requesting a concise summary for best results.
  • Streaming and async calls improve responsiveness in real-time applications.
Verified 2026-04 · gpt-4o, gpt-4o-mini