Chunking strategies for summarization
Quick answer
Use chunking to split large texts into manageable pieces before summarization with AI models. Common strategies include fixed-size chunks, semantic chunking by paragraphs or sentences, and overlapping chunks that preserve context for better summary quality.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0"
Setup
Install the openai Python package and set your API key as an environment variable for secure access.
pip install openai

output

Collecting openai
  Downloading openai-1.0.0-py3-none-any.whl (50 kB)
Installing collected packages: openai
Successfully installed openai-1.0.0
Step by step
This example demonstrates chunking a long text into fixed-size overlapping chunks and summarizing each chunk using the gpt-4o model from OpenAI.
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample long text
long_text = (
    "Artificial intelligence (AI) is transforming industries by enabling machines to learn from data, "
    "make decisions, and perform tasks that typically require human intelligence. However, large documents "
    "can exceed model input limits, so chunking is essential for effective summarization. "
    "Chunking strategies include fixed-size chunks, semantic chunks by paragraphs or sentences, "
    "and overlapping chunks to maintain context continuity."
)

# Chunking parameters
chunk_size = 100  # characters
overlap = 20      # characters

# Function to create overlapping chunks
def chunk_text(text, size, overlap):
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + size, len(text))
        chunks.append(text[start:end])
        if end == len(text):  # stop here to avoid a redundant trailing chunk
            break
        start += size - overlap
    return chunks

chunks = chunk_text(long_text, chunk_size, overlap)

summaries = []
for i, chunk in enumerate(chunks):
    messages = [
        {"role": "user", "content": f"Summarize this text chunk:\n{chunk}"}
    ]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    summary = response.choices[0].message.content.strip()
    print(f"Chunk {i+1} summary:\n{summary}\n")
    summaries.append(summary)

# Optionally, combine chunk summaries into a final summary
final_prompt = (
    "Summarize the following summaries into a concise overall summary:\n"
    + "\n".join(summaries)
)
final_response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": final_prompt}],
)
final_summary = final_response.choices[0].message.content.strip()
print("Final combined summary:\n", final_summary)

output
Chunk 1 summary:
AI transforms industries by enabling machines to learn and make decisions. Chunking helps summarize large documents effectively.

Chunk 2 summary:
Common chunking methods include fixed-size, semantic, and overlapping chunks to preserve context.

Final combined summary:
AI revolutionizes industries by enabling intelligent machines. Effective summarization of large texts requires chunking strategies like fixed-size, semantic, and overlapping chunks to maintain context.
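To see how the overlap behaves at chunk boundaries, you can exercise the chunking helper offline, with no API key, on a toy string (the sizes below are purely illustrative):

```python
def chunk_text(text, size, overlap):
    """Fixed-size chunks where consecutive chunks share `overlap` characters."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + size, len(text))
        chunks.append(text[start:end])
        if end == len(text):  # stop to avoid a redundant trailing chunk
            break
        start += size - overlap
    return chunks

chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", size=10, overlap=3)
print(chunks)
# → ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Note that each chunk repeats the last three characters of its predecessor; that shared span is what carries context across chunk boundaries when each chunk is summarized independently.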
Common variations
- Use semantic chunking by splitting text on paragraphs or sentences for better context preservation.
- Implement asynchronous calls with asyncio for faster processing of multiple chunks.
- Try different models like gpt-4o-mini for cost-effective summarization or claude-3-5-sonnet-20241022 for alternative APIs.
import asyncio
import os

from openai import AsyncOpenAI

# In openai>=1.0 there is no acreate(); async usage goes through the
# AsyncOpenAI client with the regular create() method awaited.
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def summarize_chunk(chunk):
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarize this chunk:\n{chunk}"}],
    )
    return response.choices[0].message.content.strip()

async def main():
    chunks = ["First chunk text.", "Second chunk text.", "Third chunk text."]
    # gather() runs all chunk requests concurrently
    summaries = await asyncio.gather(*(summarize_chunk(c) for c in chunks))
    print("Summaries:", summaries)

asyncio.run(main())

output
Summaries: ['Summary of first chunk.', 'Summary of second chunk.', 'Summary of third chunk.']
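The first variation, semantic chunking, can be sketched with the standard library alone: split on sentence boundaries, then pack whole sentences into chunks up to a size budget. The regex below is a simple approximation of sentence splitting, not a full tokenizer, and the budget is illustrative:

```python
import re

def semantic_chunks(text, max_chars=200):
    """Pack whole sentences into chunks of at most max_chars characters,
    so no sentence is ever split across a chunk boundary."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip() if current else sentence
    if current:
        chunks.append(current)
    return chunks

text = (
    "AI is transforming industries. Large documents can exceed model limits. "
    "Chunking keeps each request within budget. Overlap preserves context."
)
for chunk in semantic_chunks(text, max_chars=80):
    print(repr(chunk))
```

Because boundaries fall between sentences, each chunk is a self-contained unit of meaning, which usually yields better per-chunk summaries than cutting mid-sentence.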
Troubleshooting
- If you hit token limits, reduce chunk size; note that overlap adds tokens across chunks, so increase it cautiously.
- For incomplete summaries, ensure chunks have enough context by using overlap or semantic chunking.
- If API calls fail, verify your OPENAI_API_KEY environment variable is set correctly.
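Token limits are counted in tokens, not characters, so when sizing chunks it helps to derive a character budget from the model's context window. The ~4 characters per token ratio below is a rough rule of thumb for English text (use a real tokenizer such as tiktoken for exact counts), and the budget figures are illustrative assumptions:

```python
def chars_for_token_budget(max_tokens, chars_per_token=4):
    """Rough character budget for a token limit, assuming ~4 chars/token
    for typical English text (an approximation, not an exact count)."""
    return max_tokens * chars_per_token

# Leave headroom for the prompt wrapper and the model's reply.
model_context = 8192        # hypothetical context window
reserved_for_reply = 1024   # tokens kept free for the summary
prompt_overhead = 64        # tokens for the instruction text
chunk_tokens = model_context - reserved_for_reply - prompt_overhead
print(chars_for_token_budget(chunk_tokens))  # → 28416
```

Sizing chunks from the token budget up front avoids trial-and-error against rate-limit and context-length errors.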
Key Takeaways
- Chunk large texts into overlapping or semantic chunks to fit model input limits and preserve context.
- Use asynchronous API calls to speed up summarization of multiple chunks.
- Combine individual chunk summaries for a coherent overall summary.
- Adjust chunk size and overlap based on token limits and context needs.
- Always secure your API key via environment variables to avoid leaks.