ValueError
ValueError (Python built-in), raised in langchain.text_splitter.recursive_character_text_splitter
Stack trace
ValueError: chunk_size must be a positive integer greater than zero
File "/path/to/langchain/text_splitter/recursive_character_text_splitter.py", line 123, in __init__
raise ValueError("chunk_size must be a positive integer greater than zero")
File "app.py", line 45, in <module>
splitter = RecursiveCharacterTextSplitter(chunk_size=0, chunk_overlap=20) # triggers error
Why it happens
RecursiveCharacterTextSplitter requires chunk_size to be a positive integer, because each chunk it produces must hold at least one character. When chunk_size is zero or negative, the constructor rejects the value and raises ValueError before any text is split.
Detection
Validate the chunk_size parameter before initializing RecursiveCharacterTextSplitter; add assertions or input checks to ensure chunk_size > 0 to catch this error early.
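The check described here can be sketched as a small helper. The name `require_positive_chunk_size` is hypothetical and not part of LangChain. It raises explicitly instead of using `assert`, since assertions are skipped when Python runs with `-O`.

```python
def require_positive_chunk_size(chunk_size):
    """Return chunk_size unchanged if it is a positive int, else raise.

    Hypothetical helper for illustration; not part of LangChain.
    """
    # bool is a subclass of int, so reject True/False explicitly.
    if isinstance(chunk_size, bool) or not isinstance(chunk_size, int):
        raise TypeError(
            f"chunk_size must be an int, got {type(chunk_size).__name__}"
        )
    if chunk_size <= 0:
        raise ValueError(f"chunk_size must be greater than zero, got {chunk_size}")
    return chunk_size
```

Calling `require_positive_chunk_size(value)` immediately before constructing the splitter surfaces the problem at the call site, with the offending value included in the message.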
Causes & fixes
chunk_size parameter is set to zero
Set chunk_size to a positive integer greater than zero, e.g., 1000 or 500, depending on your text chunking needs.
chunk_size parameter is negative or non-integer
Ensure chunk_size is a positive integer by validating its type and value before passing it to RecursiveCharacterTextSplitter.
chunk_size is dynamically calculated and results in zero due to incorrect logic
Review and correct the logic that computes chunk_size to guarantee it never evaluates to zero or less.
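For the dynamically calculated case, one common source of a zero is integer division on a short input. The sketch below clamps the computed value to a floor. The function name, its parameters, and the default floor of 100 are assumptions chosen for illustration.

```python
def compute_chunk_size(total_chars, target_chunks, minimum=100):
    """Derive a chunk size from text length, never returning less than `minimum`.

    Hypothetical helper; names and the default floor are assumptions.
    """
    if target_chunks <= 0:
        raise ValueError(f"target_chunks must be positive, got {target_chunks}")
    if minimum <= 0:
        raise ValueError(f"minimum must be positive, got {minimum}")
    # Integer division yields 0 whenever total_chars < target_chunks,
    # e.g. 5 // 10 == 0, which is the value the splitter rejects.
    raw = total_chars // target_chunks
    # Clamp to the floor so short or empty inputs still give a valid size.
    return max(minimum, raw)
```

With this guard, an empty or very short document produces the floor value instead of zero, so the splitter can still be constructed.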
Code: broken vs fixed
# Broken: chunk_size=0 is rejected by the constructor
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size=0, chunk_overlap=50) # triggers ValueError
chunks = splitter.split_text("Some long text to split...") # never reached
# Fixed: chunk_size set to a positive integer
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
chunks = splitter.split_text("Some long text to split...")
print(f"Number of chunks: {len(chunks)}")
Workaround
Wrap the RecursiveCharacterTextSplitter initialization in a try/except block that catches ValueError, and fall back to a default positive chunk_size when the supplied value is zero or otherwise invalid.
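A minimal sketch of this workaround follows. `build_splitter` and `DEFAULT_CHUNK_SIZE` are hypothetical names, and the fallback of 1000 is an assumption. The splitter class is passed in as an argument so the helper does not depend on a particular LangChain import path.

```python
import logging

logger = logging.getLogger(__name__)

DEFAULT_CHUNK_SIZE = 1000  # assumed fallback; pick a value that suits your data


def build_splitter(splitter_cls, chunk_size, chunk_overlap=50):
    """Construct a splitter, retrying with DEFAULT_CHUNK_SIZE on ValueError.

    splitter_cls is expected to be RecursiveCharacterTextSplitter, or any
    class that accepts chunk_size and chunk_overlap keyword arguments.
    """
    try:
        return splitter_cls(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    except ValueError as exc:
        # Log the original problem so the bad value does not go unnoticed.
        logger.warning(
            "Invalid chunk_size=%r (%s); falling back to %d",
            chunk_size, exc, DEFAULT_CHUNK_SIZE,
        )
        return splitter_cls(
            chunk_size=DEFAULT_CHUNK_SIZE, chunk_overlap=chunk_overlap
        )
```

Typical use would be `build_splitter(RecursiveCharacterTextSplitter, chunk_size=0)`. If the retry also fails, for example because chunk_overlap is itself invalid, the second ValueError propagates to the caller. Because the fallback hides a bad input, treat it as a stopgap and fix the source of the invalid value.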
Prevention
Always validate chunk_size before passing it to RecursiveCharacterTextSplitter, and consider adding configuration validation in your application so that an invalid chunk size is rejected before it reaches the splitter.
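One way to centralize this validation is a small settings object that checks its values when it is created. The `SplitterConfig` class below is a hypothetical sketch, not a LangChain type, and its defaults are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitterConfig:
    """Hypothetical application-level settings, validated on construction."""

    chunk_size: int = 1000
    chunk_overlap: int = 50

    def __post_init__(self):
        # Check type and range here so bad configuration fails at load time.
        if isinstance(self.chunk_size, bool) or not isinstance(self.chunk_size, int):
            raise TypeError("chunk_size must be an int")
        if self.chunk_size <= 0:
            raise ValueError(f"chunk_size must be > 0, got {self.chunk_size}")

    def as_kwargs(self):
        """Keyword arguments suitable for the splitter constructor."""
        return {"chunk_size": self.chunk_size, "chunk_overlap": self.chunk_overlap}
```

The splitter can then be built with `RecursiveCharacterTextSplitter(**SplitterConfig(chunk_size=500).as_kwargs())`, so an invalid value fails when the configuration is loaded, not deep inside the splitting code.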