How-to · beginner · 3 min read

Async summarization pipeline

Quick answer
Use the OpenAI Python SDK's AsyncOpenAI client and call client.chat.completions.create with stream=True to build an asynchronous summarization pipeline. This enables non-blocking streaming of summary tokens for efficient processing of large texts.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0" (quote the specifier so the shell does not treat > as a redirect)

Setup

Install the official openai Python package version 1.0 or higher and set your OpenAI API key as an environment variable.

  • Install package: pip install openai
  • Set environment variable in your shell: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows)
bash
pip install openai
output
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x

Step by step

This example demonstrates an asynchronous summarization pipeline using the OpenAI SDK's async client and streaming. It sends a long text to the gpt-4o model with a prompt to summarize, then asynchronously receives and prints the streamed summary tokens.

python
import os
import asyncio
from openai import AsyncOpenAI

async def async_summarize(text: str) -> str:
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    prompt = f"Summarize the following text concisely:\n\n{text}"

    summary = []
    # With stream=True, create() returns an async iterator of response chunks
    stream = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        stream=True
    )

    async for chunk in stream:
        delta = chunk.choices[0].delta
        if delta.content:  # delta is an object; access .content, not a dict key
            summary.append(delta.content)
            print(delta.content, end="", flush=True)

    print()  # newline after streaming
    return ''.join(summary)

async def main():
    long_text = (
        "Artificial intelligence (AI) is transforming industries by enabling machines to learn from data, "
        "make decisions, and perform tasks that typically require human intelligence. This technology is "
        "applied in healthcare, finance, autonomous vehicles, and more, driving innovation and efficiency."
    )
    summary = await async_summarize(long_text)
    print(f"\nFinal summary:\n{summary}")

if __name__ == "__main__":
    asyncio.run(main())
output
Artificial intelligence enables machines to learn and make decisions, transforming industries like healthcare, finance, and autonomous vehicles.

Final summary:
Artificial intelligence enables machines to learn and make decisions, transforming industries like healthcare, finance, and autonomous vehicles.
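For inputs longer than the model's context window, a common extension is to split the text into chunks and summarize each chunk separately. The chunk_text helper below is a hypothetical illustration (not part of the SDK) that splits on word count:

```python
from typing import List

def chunk_text(text: str, max_words: int = 500) -> List[str]:
    """Split text into chunks of at most max_words words (hypothetical helper)."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]

# Small demo with a tiny max_words so the split is visible
print(chunk_text("one two three four five six", max_words=2))
# → ['one two', 'three four', 'five six']
```

Each chunk can then be passed to async_summarize, and the per-chunk summaries joined or summarized again in a final pass.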

Common variations

  • Use different models like gpt-4o-mini for faster, cheaper summarization.
  • Use synchronous calls with client.chat.completions.create if async is not needed.
  • Integrate with async web frameworks like FastAPI for streaming summarization endpoints.
  • Adjust max_tokens and temperature parameters for summary length and creativity.
python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize: AI impacts healthcare and finance."}],
    max_tokens=100,
    temperature=0.3
)
print(response.choices[0].message.content)
output
AI is revolutionizing healthcare and finance by improving diagnostics, personalizing treatments, automating processes, and enhancing decision-making.
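The async pipeline pays off when summarizing many documents at once: asyncio.gather schedules the requests concurrently instead of one after another. The sketch below uses a placeholder coroutine in place of the API-backed async_summarize so it runs without an API key; swap in the function from the step-by-step section for real use.

```python
import asyncio

async def fake_summarize(text: str) -> str:
    # Placeholder standing in for the real API-backed async_summarize
    await asyncio.sleep(0.01)  # simulate network latency
    return text[:20] + "..."

async def summarize_batch(texts):
    # gather() runs all the coroutines concurrently on the event loop
    return await asyncio.gather(*(fake_summarize(t) for t in texts))

docs = [
    "First long document about AI in healthcare.",
    "Second long document about finance automation.",
]
print(asyncio.run(summarize_batch(docs)))
```

With real API calls, total wall-clock time approaches that of the slowest single request rather than the sum of all of them.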

Troubleshooting

  • If you get AuthenticationError, verify your OPENAI_API_KEY environment variable is set correctly.
  • For TimeoutError, increase network timeout or retry the request.
  • If streaming hangs, ensure you are awaiting the stream inside a running event loop (e.g. via asyncio.run) and using Python 3.8+.
  • Check model availability and quota limits on your OpenAI dashboard.
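For transient failures such as timeouts or rate limits, a simple retry-with-exponential-backoff wrapper often suffices. This is a generic sketch, not an SDK feature; the flaky coroutine below is a stand-in for the real API call:

```python
import asyncio

async def with_retries(coro_fn, attempts: int = 3, base_delay: float = 0.05):
    """Retry an async callable with exponential backoff between failures."""
    for attempt in range(attempts):
        try:
            return await coro_fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; surface the last error
            await asyncio.sleep(base_delay * (2 ** attempt))

calls = {"n": 0}

async def flaky():
    # Fails twice, then succeeds -- simulates transient timeouts
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "summary text"

print(asyncio.run(with_retries(flaky)))  # succeeds on the third attempt
```

In production you would typically retry only on specific exceptions (e.g. timeout and rate-limit errors) rather than a bare Exception.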

Key Takeaways

  • Use the OpenAI Python SDK's async client with streaming for efficient summarization pipelines.
  • Set stream=True to receive partial summary tokens asynchronously.
  • Adjust model and parameters to balance speed, cost, and summary quality.
  • Always set your API key via environment variables to avoid authentication errors.
Verified 2026-04 · gpt-4o, gpt-4o-mini