ValueError
builtins.ValueError
Stack trace
ValueError: Number of reranked documents (5) does not match number of input documents (10)
File "/app/rerank_pipeline.py", line 42, in rerank_documents
raise ValueError(f"Number of reranked documents ({len(reranked)}) does not match number of input documents ({len(documents)})")
Why it happens
The pipeline's validation step expects the reranker to return exactly as many documents as it received, so that every input document keeps a place in the output. If the reranker filters, drops, or truncates documents, the lengths differ and the check raises this error. The usual causes are a reranker implementation that does not return every document, or filtering logic applied somewhere unexpected.
Detection
Add an assertion immediately after the reranker call to verify that the output length matches the input document count before any downstream processing.
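A minimal sketch of such a check, assuming documents are hashable values such as strings. The name `check_rerank_output` is hypothetical and not part of any library. Besides the length, it also compares contents, which catches a reranker that returns the right number of documents but duplicates or substitutes some of them.

```python
from collections import Counter


def check_rerank_output(documents, reranked):
    """Raise AssertionError unless reranked is a reordering of documents."""
    # Length check: the same condition the ValueError above reports.
    assert len(reranked) == len(documents), (
        f"reranker returned {len(reranked)} documents, expected {len(documents)}"
    )
    # Content check (assumes hashable documents): same items, same multiplicity.
    assert Counter(reranked) == Counter(documents), (
        "reranker output is not a reordering of the input"
    )
    return reranked
```

Note that Python skips `assert` statements when run with the `-O` flag, so a check that must hold in production is better written as an explicit `if` with a `raise`.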
Causes & fixes
Cause: The reranker function filters out documents or returns fewer results than it was given.
Fix: Ensure the reranker returns a reordered list of the same length as the input, without dropping any entries.
Cause: The input document list is modified or truncated before reranking, but validation still uses the original count.
Fix: Pass the exact list of documents to the reranker without modification, or update the expected count to match the list that is actually passed in.
Cause: The reranker output is unpacked or handled incorrectly, e.g., only part of the results is extracted.
Fix: Verify that the reranker output is captured in full and returned as a complete list matching the input length.
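When the reranker itself cannot be changed (for instance, it applies a fixed cutoff), one possible way to satisfy the first fix is to append the omitted documents after the ranked ones. The sketch below assumes the reranker returns items taken from the input list and that documents are hashable; `complete_ranking` is a hypothetical helper name.

```python
from collections import Counter


def complete_ranking(documents, reranked):
    """Return reranked followed by any input documents it omitted, in input order."""
    remaining = Counter(reranked)  # copies of each document the reranker returned
    missing = []
    for doc in documents:
        if remaining[doc] > 0:
            remaining[doc] -= 1  # this copy already appears in the ranking
        else:
            missing.append(doc)  # dropped by the reranker; keep input order
    return list(reranked) + missing
```

The appended documents carry no ranking signal, so whether placing them at the end is acceptable depends on how downstream steps use the order.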
Code: broken vs fixed
Broken
def rerank_documents(documents, reranker):
    reranked = reranker(documents)
    # Raises if the reranker returned a different number of documents
    if len(reranked) != len(documents):
        raise ValueError(f"Number of reranked documents ({len(reranked)}) does not match number of input documents ({len(documents)})")
    return reranked

# Example usage
input_docs = ['doc1', 'doc2', 'doc3', 'doc4', 'doc5', 'doc6', 'doc7', 'doc8', 'doc9', 'doc10']

# Reranker that drops documents (incorrect)
def bad_reranker(docs):
    return docs[:5]  # returns only the first 5 of 10 docs

rerank_documents(input_docs, bad_reranker)  # triggers ValueError
Fixed
def rerank_documents(documents, reranker):
    reranked = reranker(documents)
    # Validation is unchanged; the fix is a reranker that returns every document
    if len(reranked) != len(documents):
        raise ValueError(f"Number of reranked documents ({len(reranked)}) does not match number of input documents ({len(documents)})")
    return reranked

# Example usage
input_docs = ['doc1', 'doc2', 'doc3', 'doc4', 'doc5', 'doc6', 'doc7', 'doc8', 'doc9', 'doc10']

# Correct reranker that returns the full list, reordered
def good_reranker(docs):
    # Example: reverse the order but keep all docs
    return list(reversed(docs))

reranked_docs = rerank_documents(input_docs, good_reranker)
print("Reranked documents count matches input count:", len(reranked_docs))  # count is 10
Workaround
Wrap the reranker call in try/except ValueError. If the error is caught, log the input and output lengths and fall back to returning the original documents unmodified, so the pipeline keeps running.
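The workaround might look roughly like this. `rerank_with_fallback` is a hypothetical wrapper name, and `rerank_documents` is repeated from the code section so the block runs on its own.

```python
import logging

logger = logging.getLogger(__name__)


def rerank_documents(documents, reranker):
    # Same validation as in the code section above.
    reranked = reranker(documents)
    if len(reranked) != len(documents):
        raise ValueError(
            f"Number of reranked documents ({len(reranked)}) does not match "
            f"number of input documents ({len(documents)})"
        )
    return reranked


def rerank_with_fallback(documents, reranker):
    """Hypothetical wrapper: return the input order if validation fails."""
    try:
        return rerank_documents(documents, reranker)
    except ValueError as exc:
        # The exception message already carries both lengths.
        logger.warning("Reranking skipped, returning input order: %s", exc)
        return list(documents)
```

One caveat: this also swallows any `ValueError` raised inside the reranker itself, which can hide unrelated bugs. It is best treated as a temporary measure while the reranker is fixed.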
Prevention
Design rerankers to always return a reordered list matching the input document count exactly, and add automated tests asserting input/output length equality before deployment.
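An automated test along these lines could be written with the standard `unittest` module. `shuffle_reranker` is a made-up stand-in for illustration, to be swapped for the real reranker under test.

```python
import random
import unittest


def shuffle_reranker(docs):
    """Stand-in reranker for the test; replace with the real one."""
    shuffled = list(docs)
    random.Random(0).shuffle(shuffled)  # fixed seed keeps the test deterministic
    return shuffled


class RerankerLengthTest(unittest.TestCase):
    def test_output_is_reordering_of_input(self):
        # Cover the empty list, a single document, and larger inputs.
        for n in (0, 1, 10, 100):
            docs = [f"doc{i}" for i in range(n)]
            with self.subTest(n=n):
                reranked = shuffle_reranker(docs)
                self.assertEqual(len(reranked), len(docs))
                # Same documents regardless of order.
                self.assertCountEqual(reranked, docs)
```

Saved in a test module, this can be run with `python -m unittest` as part of the pre-deployment checks.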