What chunk overlap to use for RAG
Quick answer
Use a chunk overlap of 10-30% of the chunk size for RAG workflows to maintain context continuity across chunks without excessive redundancy. For example, with a chunk size of 500 tokens, an overlap of 50-150 tokens means that text near a chunk boundary also appears at the start of the next chunk, so a passage that straddles the boundary is less likely to be cut in half at retrieval time.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0"
Setup
Install the openai Python package and set your API key as an environment variable for secure access.
pip install "openai>=1.0"
Step by step
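This example splits a document into chunks with 20% overlap (100 out of every 500 units), suitable for RAG pipelines. Note that chunk_text slices a Python string, so chunk_size and overlap are counted in characters here, not tokens.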
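import os

from openai import OpenAI

# The client is not used by chunk_text itself; it is only needed once you
# call the API (for example, to embed the chunks).
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])


def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into chunks of chunk_size characters that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        # overlap >= chunk_size would make the window stall or move backwards
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    chunks = []
    start = 0
    text_length = len(text)
    while start < text_length:
        end = min(start + chunk_size, text_length)
        chunks.append(text[start:end])
        if end == text_length:
            break  # end of text reached; skip a tail chunk that only repeats the overlap
        start += chunk_size - overlap
    return chunks


# Example usage
sample_text = """Your long document text goes here. It can be several thousand tokens long."""
chunks = chunk_text(sample_text, chunk_size=500, overlap=100)
print(f"Generated {len(chunks)} chunks with 20% overlap.")
Output (the short placeholder text above fits in a single chunk; a real document produces more)
Generated 1 chunks with 20% overlap.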
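The quick answer is stated in tokens, whereas slicing a Python string counts characters. The following is a minimal sketch of the same sliding window applied to a list of tokens. It assumes whitespace-separated words as a rough stand-in for tokens, which only approximates a real model tokenizer, and `chunk_tokens` is a hypothetical helper name rather than part of any library.

```python
def chunk_tokens(tokens, chunk_size=500, overlap=100):
    """Slide a window of chunk_size tokens, stepping by chunk_size - overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks


# Assumption: whitespace-separated words approximate tokens.
words = ("word " * 1200).split()
token_chunks = chunk_tokens(words, chunk_size=500, overlap=100)
print([len(c) for c in token_chunks])  # [500, 500, 400]
```

To count real tokens, the word list could be swapped for the token IDs produced by the tokenizer of your embedding model; the windowing logic stays the same.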
Common variations
You can adjust overlap based on document type and retrieval needs:
- Lower overlap (10%) for highly structured data like tables or code.
- Higher overlap (30%) for narrative or conversational text to preserve context.
- Use semantic chunking, which uses embeddings to place chunk boundaries where the topic shifts, so less fixed overlap may be needed.
| Overlap % | Use case | Effect |
|---|---|---|
| 10% | Structured data (tables, code) | Less redundancy, faster retrieval |
| 20% | General documents | Balanced context and efficiency |
| 30% | Narrative or conversational text | Better context continuity, more tokens |
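To put rough numbers on the "Effect" column, the sketch below estimates how many chunks a sliding-window chunker produces and how many tokens end up stored at each overlap setting. The 10,000-token document length is an arbitrary example value, and `overlap_cost` is a hypothetical helper; it assumes a fixed chunk size with the final chunk truncated at the end of the document.

```python
import math


def overlap_cost(doc_tokens, chunk_size, overlap_pct):
    """Return (overlap, n_chunks, stored_tokens) for a sliding-window chunker."""
    overlap = round(chunk_size * overlap_pct / 100)
    step = chunk_size - overlap
    if doc_tokens <= 0:
        return overlap, 0, 0
    if doc_tokens <= chunk_size:
        n_chunks = 1
    else:
        # One full first chunk, then one more chunk per step until the end is covered.
        n_chunks = math.ceil((doc_tokens - chunk_size) / step) + 1
    # Every chunk after the first repeats `overlap` tokens of its predecessor.
    stored = doc_tokens + (n_chunks - 1) * overlap
    return overlap, n_chunks, stored


for pct in (10, 20, 30):
    overlap, n_chunks, stored = overlap_cost(10_000, 500, pct)
    print(f"{pct}% overlap: {overlap} tokens, {n_chunks} chunks, {stored} tokens stored")
# 10% overlap: 50 tokens, 23 chunks, 11100 tokens stored
# 20% overlap: 100 tokens, 25 chunks, 12400 tokens stored
# 30% overlap: 150 tokens, 29 chunks, 14200 tokens stored
```

In this example, moving from 10% to 30% overlap raises the stored (and embedded) token count from about 11% to about 42% above the original document length.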
Troubleshooting
If you notice poor answer quality or missing context in RAG outputs, increase the chunk overlap in steps of 10 percentage points (for example, from 10% to 20% of the chunk size), keeping it below the chunk size. Conversely, if retrieval latency or token usage is too high, reduce the overlap.
Also, ensure chunk boundaries do not split sentences or semantic units to avoid context loss.
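One way to keep boundaries on sentence edges is to pack whole sentences into each chunk and repeat the last sentence or two as the overlap. The sketch below is a minimal illustration under stated assumptions: `chunk_sentences` is a hypothetical helper, the regex sentence splitter is naive (it will mis-split abbreviations such as "e.g."), and `max_chars` is a soft limit that a single long sentence or the carried-over overlap can exceed.

```python
import re


def chunk_sentences(text, max_chars=500, overlap_sentences=1):
    """Pack whole sentences into chunks, repeating the last few as overlap."""
    # Naive splitter (assumption): a sentence ends at ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            chunks.append(" ".join(current))
            # Carry trailing sentences forward as overlap, but never the whole
            # chunk, which would only duplicate it.
            if 0 < overlap_sentences < len(current):
                current = current[-overlap_sentences:]
            else:
                current = []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


demo = "One. Two two. Three three three. Four four four four."
for chunk in chunk_sentences(demo, max_chars=30, overlap_sentences=1):
    print(chunk)
# One. Two two.
# Two two. Three three three.
# Three three three. Four four four four.
```

Measuring overlap in sentences rather than characters means the effective overlap percentage varies from chunk to chunk, so it is worth checking that it stays roughly within the 10-30% range on your own documents.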
Key Takeaways
- Use 10-30% chunk overlap to balance context continuity and efficiency in RAG.
- Adjust overlap based on document type: lower for structured data, higher for narratives.
- Avoid splitting sentences or semantic units when chunking to preserve meaning.