ValueError
ValueError: top_n parameter exceeds number of retrieved documents
Stack trace
Traceback (most recent call last):
  File "app.py", line 42, in rerank_results
    reranked = reranker.rerank(documents, query, top_n=top_n)
  File "/site-packages/sentence_transformers/cross_encoders/CrossEncoder.py", line 156, in rank
    raise ValueError(f'top_n ({top_n}) cannot be greater than number of documents ({len(documents)})')
ValueError: top_n (10) cannot be greater than number of documents (3)
Why it happens
Reranking pipelines accept a top_n parameter to return only the best N documents after scoring. When the vector search retrieves fewer documents than top_n requests (due to small result sets, aggressive filtering, or hard limits in the query), the reranker cannot return more documents than it received. This mismatch between expected and actual document count causes the error. It's common in RAG systems where retrieve-then-rerank pipelines don't validate document counts between stages.
Detection
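Log the document count before reranking: `print(f'Retrieved {len(docs)} docs, requesting top_n={top_n}')`. Add an assertion during development: `assert top_n <= len(docs), f'top_n {top_n} > {len(docs)} docs'` (Python skips `assert` when run with `-O`, so do not rely on it as the only guard in production). Track reranker input and output sizes in your observability platform so the mismatch shows up in staging rather than in production.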
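A minimal sketch of the logging check described above, using the standard `logging` module instead of `print`. The helper name `check_rerank_inputs` is hypothetical and not part of any reranker library; it only reports the mismatch and leaves the handling to the caller.

```python
import logging

logger = logging.getLogger("rerank")


def check_rerank_inputs(docs, top_n):
    """Log the document count and report whether top_n fits it.

    Hypothetical helper for illustration; not part of any library.
    Returns True when top_n can be satisfied, False otherwise.
    """
    logger.info("Retrieved %d docs, requesting top_n=%d", len(docs), top_n)
    if top_n > len(docs):
        # Unlike `assert`, this check still runs under `python -O`.
        logger.warning("top_n (%d) exceeds retrieved docs (%d)", top_n, len(docs))
        return False
    return True
```

Calling it right before the reranker gives one log line per request, which is usually enough to spot queries whose filters return too few documents.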
Causes & fixes
top_n is hardcoded to a value larger than the smallest result set that retrieval can return
Set top_n dynamically: `top_n = min(requested_top_n, len(retrieved_docs))` before passing to reranker
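Vector search retrieves fewer results than expected because of aggressive pre-filters or a low k
Raise the vector search k and loosen the filters so that retrieval over-fetches relative to top_n, for example `vector_store.similarity_search(query, k=20)` before reranking with `top_n=10`. Still cap top_n to the returned count, since filters can return fewer than k documents.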
Reranker receives an empty or single-document list while top_n asks for 5 or more results
Add a guard: `if len(docs) < top_n: top_n = len(docs)`. When the list is empty or holds a single document, skip reranking and return docs as-is.
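Pipeline uses a fixed top_n for all queries without checking the document count of each batch
Cap per batch inside the loop: `for batch in batches: batch_top_n = min(top_n, len(batch))`, then pass `batch_top_n` to the reranker. Taking one minimum across all batches also avoids the error, but it truncates every batch to the size of the smallest one.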
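The per-batch fix can be sketched as below. `rerank_batches` is a hypothetical name, and `rerank_fn(query, docs, top_n)` is a placeholder for whatever reranker call your pipeline uses; its signature is an assumption made for this sketch.

```python
def rerank_batches(batches, rerank_fn, top_n):
    """Rerank each (query, docs) batch with top_n capped to that batch's size.

    `rerank_fn(query, docs, top_n)` stands in for the real reranker call;
    the signature is assumed for illustration.
    """
    results = []
    for query, docs in batches:
        if not docs:
            # Nothing to rerank: skip the call instead of passing top_n=0.
            results.append([])
            continue
        capped = min(top_n, len(docs))  # cap per batch, not across batches
        results.append(rerank_fn(query, docs, capped))
    return results
```

Each batch keeps as many results as it can supply, so one sparse query does not shrink the output of the others.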
Code: broken vs fixed
# Broken version
import os
from sentence_transformers import CrossEncoder
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
# Initialize retriever and reranker
embeddings = OpenAIEmbeddings(api_key=os.environ['OPENAI_API_KEY'])
vector_store = FAISS.load_local('faiss_index', embeddings)
reranker = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')
query = 'What is quantum computing?'
top_n = 10 # Fixed value — THIS WILL FAIL if retrieval returns < 10 docs
# Retrieve documents
retrieved_docs = vector_store.similarity_search(query, k=5) # Only 5 docs returned
# Rerank — ERROR: top_n (10) > len(retrieved_docs) (5)
scores = reranker.rank(query, [doc.page_content for doc in retrieved_docs], top_n=top_n)
print(f'Top {top_n} reranked docs: {scores[:top_n]}')

# Fixed version
import os
from sentence_transformers import CrossEncoder
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
# Initialize retriever and reranker
embeddings = OpenAIEmbeddings(api_key=os.environ['OPENAI_API_KEY'])
vector_store = FAISS.load_local('faiss_index', embeddings)
reranker = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')
query = 'What is quantum computing?'
requested_top_n = 10
# Retrieve documents
retrieved_docs = vector_store.similarity_search(query, k=5) # Only 5 docs returned
# FIX: Cap top_n to actual document count
top_n = min(requested_top_n, len(retrieved_docs)) # top_n becomes 5
# Rerank: now safe, returns min(10, 5) = 5 docs
scores = reranker.rank(query, [doc.page_content for doc in retrieved_docs], top_n=top_n)
print(f'Top {top_n} reranked docs: {scores[:top_n]}')
Workaround
If you cannot modify the reranking call, catch the ValueError and fall back to returning all retrieved documents sorted by vector similarity score. In the `try` branch call `ranked = reranker.rank(..., top_n=top_n)`; in the `except ValueError` branch use `ranked = sorted(enumerate(similarity_scores), key=lambda x: x[1], reverse=True)`. This assumes higher scores mean closer matches; if your vector store returns distances, drop `reverse=True`.
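Written out in full, the fallback might look like the sketch below. `rerank_or_fallback` and `strict_rerank` are hypothetical names; `strict_rerank` is a toy stand-in that only mimics the error, and the sketch assumes `similarity_scores` holds one higher-is-better score per document.

```python
def strict_rerank(query, docs, top_n):
    """Toy stand-in for a reranker that rejects an oversized top_n."""
    if top_n > len(docs):
        raise ValueError(
            f"top_n ({top_n}) cannot be greater than number of documents ({len(docs)})"
        )
    # Toy scoring: longer documents rank higher.
    order = sorted(range(len(docs)), key=lambda i: len(docs[i]), reverse=True)
    return order[:top_n]


def rerank_or_fallback(query, docs, similarity_scores, top_n, rerank_fn=strict_rerank):
    """Return document indices, best first.

    Falls back to the vector-similarity order when the reranker raises
    ValueError. Assumes higher similarity_scores are better.
    """
    try:
        return rerank_fn(query, docs, top_n)
    except ValueError:
        return sorted(
            range(len(docs)), key=lambda i: similarity_scores[i], reverse=True
        )
```

Catching `ValueError` this broadly can also hide unrelated input errors from the reranker, so treat this as a stopgap until top_n is capped at the call site.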
Prevention
Design retrieval pipelines with validation at each stage: (1) retrieve up to k docs, (2) compare the number actually returned with the requested top_n, (3) pass min(len(docs), top_n) to the reranker. Use a wrapper class that enforces this, for example a `SafeReranker` whose `rank(self, query, docs, top_n)` method computes `top_n = min(top_n, len(docs))` before delegating to `self.reranker.rank(...)`. For production RAG, implement observability that tracks document counts through the pipeline and alerts when top_n exceeds the retrieved count.
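One possible shape for the wrapper is sketched below. It assumes the wrapped object exposes `rank(query, docs, top_n)`; that signature and the toy `AlphabeticalReranker` are illustrative assumptions, so adapt the call to the reranker you actually use.

```python
class SafeReranker:
    """Wraps a reranker and caps top_n to the number of documents received.

    `reranker` is any object with a `rank(query, docs, top_n)` method; the
    signature is an assumption for this sketch.
    """

    def __init__(self, reranker):
        self.reranker = reranker

    def rank(self, query, docs, top_n):
        if top_n < 1:
            raise ValueError(f"top_n must be >= 1, got {top_n}")
        if not docs:
            return []  # nothing to rank; skip the underlying call
        return self.reranker.rank(query, docs, top_n=min(top_n, len(docs)))


# Usage with a toy reranker (stand-in for a real model):
class AlphabeticalReranker:
    def rank(self, query, docs, top_n):
        return sorted(docs)[:top_n]


safe = SafeReranker(AlphabeticalReranker())
top = safe.rank("q", ["b", "a", "c"], top_n=10)  # top_n is capped to 3
```

Putting the cap in one wrapper means individual call sites do not need to remember the `min()` check.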