Small-to-big chunking explained
Quick answer
Small-to-big chunking is a technique where text is first split into small, fine-grained chunks and then merged into larger chunks to make better use of a model's context window. This approach balances granularity against context size, enabling efficient processing with models like gpt-4o.
PREREQUISITES
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0"
Setup
Set your API key as an environment variable so the client can read it securely, then install the openai Python package.
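For example, on macOS or Linux (the key value below is a placeholder for your own key):

export OPENAI_API_KEY="your-api-key"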
pip install "openai>=1.0"

Step by step
This example demonstrates small-to-big chunking by first splitting text into small sentence-level chunks, then merging them into bigger chunks before sending each one to the gpt-4o model for summarization.
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample text to chunk
text = (
    "Artificial intelligence is transforming industries by enabling new capabilities. "
    "However, large documents often exceed model context limits. "
    "Small-to-big chunking helps manage this by splitting and merging text efficiently."
)

# Step 1: Small chunking (split into individual sentences)
small_chunks = text.split('. ')

# Step 2: Merge small chunks into bigger chunks (combine 2 sentences each)
big_chunks = []
chunk_size = 2
for i in range(0, len(small_chunks), chunk_size):
    merged = '. '.join(small_chunks[i:i + chunk_size])
    if not merged.endswith('.'):  # splitting on '. ' strips the period; restore it
        merged += '.'
    big_chunks.append(merged)

# Step 3: Process each big chunk with the model
for idx, chunk in enumerate(big_chunks, 1):
    messages = [{"role": "user", "content": f"Summarize this: {chunk}"}]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    summary = response.choices[0].message.content
    print(f"Chunk {idx} summary:\n{summary}\n")

Output
Chunk 1 summary:
Artificial intelligence is transforming industries by enabling new capabilities.

Chunk 2 summary:
Large documents can exceed model context limits, so small-to-big chunking manages this by splitting and merging text efficiently.
Common variations
- Use async calls with asyncio and the AsyncOpenAI client for concurrency (see the first sketch after this list).
- Adjust chunk sizes dynamically based on token counts using a tokenizer library such as tiktoken (second sketch below).
- Apply small-to-big chunking with other models like claude-3-5-sonnet-20241022 by swapping in that provider's client and model name (third sketch below).
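A minimal concurrency sketch, assuming the big_chunks list from the step-by-step example above; in openai>=1.0 async requests go through the AsyncOpenAI client, and asyncio.gather runs all requests at once:

import asyncio
from openai import AsyncOpenAI

async_client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

async def summarize(chunk: str) -> str:
    # Each call awaits its own request, so gather() can overlap them
    response = await async_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize this: {chunk}"}],
    )
    return response.choices[0].message.content

async def main(chunks):
    summaries = await asyncio.gather(*(summarize(c) for c in chunks))
    for idx, summary in enumerate(summaries, 1):
        print(f"Chunk {idx} summary:\n{summary}\n")

asyncio.run(main(big_chunks))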
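For token-aware merging, one option is the tiktoken library (pip install tiktoken). This sketch greedily packs sentences into chunks that stay under a token budget; the max_tokens default of 100 is an arbitrary value chosen for illustration:

import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")

def merge_by_tokens(small_chunks, max_tokens=100):
    # Greedily pack sentences while the merged chunk stays under the budget
    big_chunks, current = [], ""
    for sentence in small_chunks:
        candidate = f"{current} {sentence}".strip()
        if current and len(enc.encode(candidate)) > max_tokens:
            big_chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        big_chunks.append(current)
    return big_chunks

big_chunks = merge_by_tokens(small_chunks)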
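The chunking logic itself is model-agnostic; only the API call changes. A sketch using Anthropic's Python SDK (pip install anthropic, with ANTHROPIC_API_KEY set in your environment):

import anthropic

anthropic_client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

for idx, chunk in enumerate(big_chunks, 1):
    response = anthropic_client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,  # the Messages API requires an output token cap
        messages=[{"role": "user", "content": f"Summarize this: {chunk}"}],
    )
    print(f"Chunk {idx} summary:\n{response.content[0].text}\n")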
Troubleshooting
- If you get context length errors, reduce the big chunk size or split more granularly.
- Ensure your API key is set correctly in os.environ["OPENAI_API_KEY"].
- Check network connectivity if requests time out.
Key Takeaways
- Small-to-big chunking balances detail and context size for efficient AI processing.
- Start with small chunks and merge progressively to fit model context limits.
- Adjust chunk sizes dynamically based on token count for best results.