Recursive summarization explained
Quick answer
Recursive summarization is a technique for handling texts too long to fit in a model's context window: the text is split into smaller chunks, each chunk is summarized individually with a model such as gpt-4o, and the resulting summaries are themselves summarized, recursively, until a single concise summary remains.
Prerequisites
- Python 3.8+
- An OpenAI API key (free tier works)
- `pip install openai>=1.0`
Setup
Install the openai Python package and set your API key as an environment variable for secure authentication.
pip install "openai>=1.0"

Output:

Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x

Note the quotes around the requirement specifier: without them, most shells interpret `>=` as output redirection.
Step by step
This example demonstrates recursive summarization by splitting a long text into chunks, summarizing each chunk with gpt-4o, and then summarizing the combined chunk summaries recursively until a final summary is obtained.
import os

from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])


# Summarize a single text chunk
def summarize_chunk(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize this text concisely:\n\n{text}"}],
    )
    return response.choices[0].message.content.strip()


def recursive_summarize(text: str, chunk_size: int = 1000) -> str:
    # Base case: if the text is short enough, summarize it directly
    if len(text) <= chunk_size:
        return summarize_chunk(text)
    # Split the text into fixed-size chunks
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    # Summarize each chunk
    chunk_summaries = [summarize_chunk(chunk) for chunk in chunks]
    # Combine the chunk summaries and recurse
    combined_summary = "\n".join(chunk_summaries)
    return recursive_summarize(combined_summary, chunk_size)


# Example usage
if __name__ == "__main__":
    long_text = (
        "OpenAI's GPT models can handle large texts by breaking them down into manageable chunks. "
        "Recursive summarization helps condense very long documents by summarizing summaries. "
        "This technique is useful for books, research papers, or lengthy reports where a single prompt would exceed token limits. "
        "By recursively summarizing, you maintain context while reducing length step-by-step."
    ) * 10  # Repeat to simulate a long document

    final_summary = recursive_summarize(long_text, chunk_size=500)
    print("Final summary:\n", final_summary)

Output:
Final summary: OpenAI's GPT models can summarize large texts by breaking them into chunks and recursively summarizing these summaries, enabling concise overviews of very long documents while preserving context.
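To see the recursion structure without making API calls, you can inject a stub summarizer (the helper below is illustrative, not part of the OpenAI SDK):

```python
def recursive_summarize_with(summarize, text: str, chunk_size: int = 500) -> str:
    # Same control flow as recursive_summarize above, but the summarizer is injected
    if len(text) <= chunk_size:
        return summarize(text)
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    combined = "\n".join(summarize(c) for c in chunks)
    return recursive_summarize_with(summarize, combined, chunk_size)


calls = []

def stub_summarize(text: str) -> str:
    calls.append(len(text))
    return text[:100]  # pretend the "summary" is the first 100 characters

final = recursive_summarize_with(stub_summarize, "x" * 2000, chunk_size=500)
print(calls)  # [500, 500, 500, 500, 403]
```

Five summarizer calls are made: four for the first-level chunks, then one for the combined summary once it fits under `chunk_size`.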
Common variations
- Use asynchronous calls with `asyncio` and the `AsyncOpenAI` client for faster parallel chunk summarization.
- Adjust `chunk_size` based on the model's token limits and the text's complexity.
- Use a cheaper model such as `gpt-4o-mini` for cost-effective summarization, or `claude-3-5-sonnet-20241022` via the Anthropic SDK.
- Use prompt engineering to customize summary style or length.
The async variation uses `AsyncOpenAI` (the v1 SDK has no `chat.completions.acreate`; you await `create` on the async client) so that all chunks at one level are summarized concurrently:

import os
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])


async def summarize_chunk_async(text: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarize this text concisely:\n\n{text}"}],
    )
    return response.choices[0].message.content.strip()


async def recursive_summarize_async(text: str, chunk_size: int = 1000) -> str:
    if len(text) <= chunk_size:
        return await summarize_chunk_async(text)
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    # Summarize all chunks concurrently
    chunk_summaries = await asyncio.gather(*(summarize_chunk_async(c) for c in chunks))
    combined_summary = "\n".join(chunk_summaries)
    return await recursive_summarize_async(combined_summary, chunk_size)


# Usage example
if __name__ == "__main__":
    import nest_asyncio
    nest_asyncio.apply()  # Needed only in Jupyter or other already-running event loops

    long_text = "Your very long text here..." * 10
    final_summary = asyncio.run(recursive_summarize_async(long_text, chunk_size=500))
    print("Final async summary:\n", final_summary)

Output:
Final async summary: OpenAI's GPT models can recursively summarize large texts by chunking and summarizing summaries, enabling efficient and concise overviews of lengthy documents.
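Character counts only approximate token counts. Below is a sketch of a chunker that at least avoids cutting words in half, using the rough rule of thumb that one English token is about four characters (`split_by_approx_tokens` is a hypothetical helper; for exact counts, encode the text with the `tiktoken` library instead):

```python
def split_by_approx_tokens(text: str, max_tokens: int = 500) -> list[str]:
    # Rough heuristic: one token is about four characters of English text
    chunk_chars = max_tokens * 4
    chunks, current, current_len = [], [], 0
    for word in text.split():
        # +1 accounts for the space that rejoins the words
        if current and current_len + len(word) + 1 > chunk_chars:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(word)
        current_len += len(word) + 1
    if current:
        chunks.append(" ".join(current))
    return chunks


chunks = split_by_approx_tokens("word " * 1000, max_tokens=100)
print(len(chunks), "chunks, longest:", max(len(c) for c in chunks))
```

Each chunk stays under `max_tokens * 4` characters while breaking only at whitespace, so no word is split across two API calls.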
Troubleshooting
- If you get a `RateLimitError`, reduce concurrency or add retry logic with exponential backoff.
- If summaries are too generic, improve the prompt with instructions like "Focus on key points" or "Use bullet points."
- For token limit errors, decrease `chunk_size` or switch to a model with a larger context window.
- Ensure the `OPENAI_API_KEY` environment variable is set correctly to avoid authentication errors.
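A minimal sketch of the retry-with-backoff idea (the wrapper name and parameters are illustrative; in real code you would catch `openai.RateLimitError` specifically rather than bare `Exception`):

```python
import random
import time


def with_backoff(fn, max_retries: int = 5, base_delay: float = 1.0):
    """Call fn(), retrying failures with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the last error
            # Delays of roughly 1s, 2s, 4s, ... plus random jitter
            time.sleep(base_delay * 2 ** attempt + random.random() * base_delay)


# Wrap any chunk summarization call, e.g.:
# summary = with_backoff(lambda: summarize_chunk(chunk))
```

The jitter spreads retries out so that many concurrent workers do not all hit the API again at the same instant.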
Key takeaways
- Recursive summarization breaks large texts into chunks and summarizes them stepwise to handle token limits.
- Use the OpenAI SDK v1 pattern with client.chat.completions.create for clean, production-ready code.
- Adjust chunk size and model choice based on your document length and cost constraints.
- Async summarization speeds up processing by parallelizing chunk summaries.
- Improve summary quality by refining prompts and handling API rate limits gracefully.