How-to · Intermediate · 4 min read

Chunking strategies for summarization

Quick answer
Use chunking to split large texts into manageable pieces before summarizing them with AI models. Common strategies include fixed-size chunks, semantic chunking by paragraph or sentence boundaries, and overlapping chunks that preserve context across chunk boundaries for better summary quality.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0" (quote the spec so the shell doesn't interpret >=)

Setup

Install the openai Python package and set your API key as an environment variable for secure access.

bash
pip install openai
output
Collecting openai
  Downloading openai-1.0.0-py3-none-any.whl (50 kB)
Installing collected packages: openai
Successfully installed openai-1.0.0

Step by step

This example demonstrates chunking a long text into fixed-size overlapping chunks and summarizing each chunk using the gpt-4o model from OpenAI.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample long text
long_text = (
    "Artificial intelligence (AI) is transforming industries by enabling machines to learn from data, "
    "make decisions, and perform tasks that typically require human intelligence. However, large documents "
    "can exceed model input limits, so chunking is essential for effective summarization. "
    "Chunking strategies include fixed-size chunks, semantic chunks by paragraphs or sentences, "
    "and overlapping chunks to maintain context continuity."
)

# Chunking parameters
chunk_size = 100  # characters
overlap = 20      # characters

# Function to create overlapping chunks
def chunk_text(text, size, overlap):
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break  # avoid a redundant trailing chunk made only of overlap
        start += size - overlap
    return chunks

chunks = chunk_text(long_text, chunk_size, overlap)

summaries = []
for i, chunk in enumerate(chunks):
    messages = [
        {"role": "user", "content": f"Summarize this text chunk:\n{chunk}"}
    ]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
    summary = response.choices[0].message.content.strip()
    print(f"Chunk {i+1} summary:\n{summary}\n")
    summaries.append(summary)

# Optionally, combine chunk summaries into a final summary
final_prompt = "Summarize the following summaries into a concise overall summary:\n" + "\n".join(summaries)
final_response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": final_prompt}]
)
final_summary = final_response.choices[0].message.content.strip()
print("Final combined summary:\n", final_summary)
output
Chunk 1 summary:
AI transforms industries by enabling machines to learn and make decisions. Chunking helps summarize large documents effectively.

Chunk 2 summary:
Common chunking methods include fixed-size, semantic, and overlapping chunks to preserve context.

Final combined summary:
AI revolutionizes industries by enabling intelligent machines. Effective summarization of large texts requires chunking strategies like fixed-size, semantic, and overlapping chunks to maintain context.

Common variations

  • Use semantic chunking by splitting text on paragraphs or sentences for better context preservation.
  • Implement asynchronous calls with asyncio for faster processing of multiple chunks.
  • Try different models like gpt-4o-mini for cost-effective summarization or claude-3-5-sonnet-20241022 for alternative APIs.
python
import asyncio
import os

from openai import AsyncOpenAI

# Async requests use the AsyncOpenAI client; the sync client has no acreate method
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def summarize_chunk(chunk):
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarize this chunk:\n{chunk}"}]
    )
    return response.choices[0].message.content.strip()

async def main():
    chunks = ["First chunk text.", "Second chunk text.", "Third chunk text."]
    summaries = await asyncio.gather(*(summarize_chunk(c) for c in chunks))
    print("Summaries:", summaries)

asyncio.run(main())
output
Summaries: ['Summary of first chunk.', 'Summary of second chunk.', 'Summary of third chunk.']
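The semantic-chunking variation needs no API calls at all: split on paragraph boundaries and pack whole paragraphs into chunks under a size budget. A minimal sketch (the function name and the 60-character budget are illustrative choices, not from any library):

```python
def semantic_chunks(text, max_chars=500):
    """Split text on blank lines (paragraphs) and pack paragraphs
    into chunks that stay under max_chars."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        # Start a new chunk if adding this paragraph would exceed the budget
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

doc = (
    "First paragraph about AI.\n\n"
    "Second paragraph about chunking.\n\n"
    "Third paragraph about summaries."
)
print(semantic_chunks(doc, max_chars=60))
```

Because chunk boundaries fall between paragraphs rather than mid-sentence, each chunk tends to be self-contained, which usually reduces the need for overlap.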

Troubleshooting

  • If you hit token limits, reduce the chunk size; keep overlap modest, since overlapping text is tokenized (and billed) once per chunk.
  • If summaries feel fragmented, give each chunk more context by increasing the overlap or switching to semantic chunking.
  • If API calls fail, verify your OPENAI_API_KEY environment variable is set correctly.
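Token limits are easiest to respect by budgeting chunk size up front. Exact counts require the model's tokenizer (e.g. the tiktoken package), but as a rough dependency-free proxy you can budget by words. This sketch (function name and limits are illustrative) applies the same overlap idea at the word level:

```python
def chunk_by_words(text, max_words=80, overlap_words=10):
    """Split text into chunks of at most max_words words,
    overlapping by overlap_words to preserve context."""
    if overlap_words >= max_words:
        raise ValueError("overlap must be smaller than chunk size")
    words = text.split()
    step = max_words - overlap_words
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # final chunk reached the end of the text
    return chunks

text = " ".join(f"word{i}" for i in range(200))
chunks = chunk_by_words(text, max_words=80, overlap_words=10)
print(len(chunks), [len(c.split()) for c in chunks])
```

For English text, one token is very roughly three quarters of a word, so a word budget can be converted to an approximate token budget when sizing chunks against a model's context window.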

Key Takeaways

  • Chunk large texts into overlapping or semantic chunks to fit model input limits and preserve context.
  • Use asynchronous API calls to speed up summarization of multiple chunks.
  • Combine individual chunk summaries for a coherent overall summary.
  • Adjust chunk size and overlap based on token limits and context needs.
  • Always secure your API key via environment variables to avoid leaks.
Verified 2026-04 · gpt-4o, gpt-4o-mini, claude-3-5-sonnet-20241022