High severity · Intermediate · Fix: 2-5 min

ValueError

ValueError: top_n parameter exceeds number of retrieved documents

What this error means
A reranker received a top_n parameter larger than the number of documents to rerank, causing it to request more results than exist.

Stack trace

traceback
Traceback (most recent call last):
  File "app.py", line 42, in rerank_results
    reranked = reranker.rerank(documents, query, top_n=top_n)
  File "/site-packages/sentence_transformers/cross_encoders/CrossEncoder.py", line 156, in rank
    raise ValueError(f'top_n ({top_n}) cannot be greater than number of documents ({len(documents)})')
ValueError: top_n (10) cannot be greater than number of documents (3)
QUICK FIX
Cap top_n with min() before calling the reranker: `top_n = min(top_n_requested, len(retrieved_documents))`.

Why it happens

Reranking pipelines accept a top_n parameter to return only the best N documents after scoring. When the vector search retrieves fewer documents than top_n requests (due to small result sets, aggressive filtering, or hard limits in the query), the reranker cannot return more documents than it received. This mismatch between expected and actual document count causes the error. It's common in RAG systems where retrieve-then-rerank pipelines don't validate document counts between stages.

Detection

Log the document count before reranking: `print(f'Retrieved {len(docs)} docs, requesting top_n={top_n}')`. During development, add an assertion: `assert top_n <= len(docs), f'top_n {top_n} > {len(docs)} docs'`. Assertions are skipped when Python runs with -O, so do not rely on them in production. Track reranker input and output sizes in your observability platform so the mismatch surfaces in staging.
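
The two checks above can be folded into one helper that logs instead of asserting. This is a minimal sketch, not part of any library: `check_rerank_inputs` and the logger name are hypothetical, and the only assumption about `docs` is that it supports `len()`.

```python
import logging
from typing import Sequence

logger = logging.getLogger("rerank_pipeline")  # hypothetical logger name


def check_rerank_inputs(docs: Sequence, top_n: int) -> bool:
    """Log the sizes going into the reranker; return True if top_n fits."""
    logger.info("Retrieved %d docs, requesting top_n=%d", len(docs), top_n)
    if top_n > len(docs):
        # This is the condition that leads to the ValueError downstream.
        logger.warning("top_n %d > %d docs", top_n, len(docs))
        return False
    return True
```

Calling it right before the rerank step gives a log line for every request and a warning whenever the counts disagree, without interrupting the request the way a failed assertion would.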

Causes & fixes

1

top_n is hardcoded to a value larger than the smallest result set that retrieval can return

✓ Fix

Set top_n dynamically: `top_n = min(requested_top_n, len(retrieved_docs))` before passing to reranker

2

Vector search retrieves fewer results than expected due to aggressive pre-filters or low k parameter

✓ Fix

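Increase the vector search k and loosen the filters so that retrieval returns at least top_n documents, for example `vector_store.similarity_search(query, k=20)` before reranking with `top_n=10`. A larger k only helps when the index and filters can actually supply that many matches, so keep the min() cap as well.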

3

Reranker receives empty or single-document list when top_n expects 5+ results

✓ Fix

Add a guard: `if len(docs) < top_n: top_n = len(docs)`. If the list is empty, skip reranking and return it as-is.

4

Pipeline uses fixed top_n for all queries without checking batch document counts

✓ Fix

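Validate per batch: inside the batch reranking loop, compute `batch_top_n = min(top_n, len(batch))` for each batch. Capping against the smallest batch overall, as in `min(top_n, min(len(batch) for batch in batches))`, also avoids the error but shrinks the results of every other batch.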
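
For cause 4, capping inside the loop could look like the sketch below. `rerank_batches` and `rank_fn` are hypothetical names: `rank_fn(query, docs, top_n)` stands in for whatever reranker call your pipeline makes, and its return type is left open.

```python
from typing import Callable, List, Sequence


def rerank_batches(
    query: str,
    batches: Sequence[Sequence[str]],
    top_n: int,
    rank_fn: Callable[[str, Sequence[str], int], List],
) -> List[List]:
    """Rerank each batch with top_n capped to that batch's own size."""
    results: List[List] = []
    for batch in batches:
        if not batch:
            # Nothing to rerank; skip the reranker call entirely.
            results.append([])
            continue
        batch_top_n = min(top_n, len(batch))
        results.append(rank_fn(query, batch, batch_top_n))
    return results
```

Each batch keeps as many results as it can supply, and empty batches never reach the reranker.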

Code: broken vs fixed

Broken - triggers the error
python
import os
from sentence_transformers import CrossEncoder
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Initialize retriever and reranker
embeddings = OpenAIEmbeddings(api_key=os.environ['OPENAI_API_KEY'])
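# Recent langchain_community releases require this flag; only load indexes you created yourself
vector_store = FAISS.load_local('faiss_index', embeddings, allow_dangerous_deserialization=True)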
reranker = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')

query = 'What is quantum computing?'
top_n = 10  # Fixed value — THIS WILL FAIL if retrieval returns < 10 docs

# Retrieve documents
retrieved_docs = vector_store.similarity_search(query, k=5)  # Only 5 docs returned

# Rerank — ERROR: top_n (10) > len(retrieved_docs) (5)
scores = reranker.rank(query, [doc.page_content for doc in retrieved_docs], top_n=top_n)
print(f'Top {top_n} reranked docs: {scores[:top_n]}')
Fixed - works correctly
python
import os
from sentence_transformers import CrossEncoder
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Initialize retriever and reranker
embeddings = OpenAIEmbeddings(api_key=os.environ['OPENAI_API_KEY'])
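# Recent langchain_community releases require this flag; only load indexes you created yourself
vector_store = FAISS.load_local('faiss_index', embeddings, allow_dangerous_deserialization=True)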
reranker = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')

query = 'What is quantum computing?'
requested_top_n = 10

# Retrieve documents
retrieved_docs = vector_store.similarity_search(query, k=5)  # Only 5 docs returned

# FIX: Cap top_n to actual document count
top_n = min(requested_top_n, len(retrieved_docs))  # top_n becomes 5

# Rerank: now safe, because top_n (5) no longer exceeds len(retrieved_docs) (5)
scores = reranker.rank(query, [doc.page_content for doc in retrieved_docs], top_n=top_n)
print(f'Top {top_n} reranked docs: {scores[:top_n]}')
The fix caps top_n at the number of retrieved documents with min() before the reranker call, which prevents the mismatch error.

Workaround

If you cannot modify the reranking call, catch the ValueError and fall back to returning all retrieved documents sorted by vector similarity score. Wrap `reranker.rank(..., top_n=top_n)` in `try` / `except ValueError`, and in the handler return `sorted(enumerate(similarity_scores), key=lambda x: x[1], reverse=True)`.
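
Written out as a block, the fallback might look like the following sketch. `rank_with_fallback` is a hypothetical helper, `rank_fn` stands in for the reranking call you cannot modify, and `similarity_scores` is assumed to hold one vector similarity score per document, with higher meaning more similar.

```python
from typing import Callable, List, Sequence


def rank_with_fallback(
    rank_fn: Callable[..., List],
    query: str,
    docs: Sequence[str],
    similarity_scores: Sequence[float],
    top_n: int,
) -> List:
    """Try the reranker; on ValueError, fall back to vector-similarity order."""
    try:
        return rank_fn(query, docs, top_n=top_n)
    except ValueError:
        # Fallback: (doc index, score) pairs for all docs, best score first.
        return sorted(
            enumerate(similarity_scores),
            key=lambda pair: pair[1],
            reverse=True,
        )
```

The fallback returns (index, score) tuples, which may not match the shape of the reranker's own output, so downstream code has to handle both. If your vector store reports distances rather than similarities, drop `reverse=True`.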

Prevention

Design retrieval pipelines with validation at each stage: (1) retrieve up to k docs, (2) compare the number actually returned with the requested top_n, (3) pass min(len(docs), top_n) to the reranker. A wrapper class can enforce this, for example a `SafeReranker` whose `rank(self, docs, query, top_n)` method sets `top_n = min(top_n, len(docs))` before delegating to `self.reranker.rank(...)`. For production RAG, add observability that tracks document counts through the pipeline and alerts when top_n exceeds the retrieved count.
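
One possible shape for the `SafeReranker` wrapper is sketched here. The inner call is an assumption: it presumes the wrapped object exposes `rank(query, docs, top_n=...)`, matching the call used in the code on this page, so adjust it to your reranker's real signature. Returning an empty list for empty input is also a choice made for this sketch.

```python
from typing import Any, List, Sequence


class SafeReranker:
    """Wraps a reranker and caps top_n to the number of documents received."""

    def __init__(self, reranker: Any) -> None:
        # Assumption: `reranker` exposes rank(query, docs, top_n=...).
        self.reranker = reranker

    def rank(self, docs: Sequence[str], query: str, top_n: int) -> List:
        if not docs:
            # Nothing to rerank; avoid calling the reranker with top_n=0.
            return []
        safe_top_n = min(top_n, len(docs))
        return self.reranker.rank(query, docs, top_n=safe_top_n)
```

Callers keep passing the top_n they want, and the cap lives in one place instead of at every call site.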

Python 3.9+ · sentence-transformers >=2.2.0 · tested on 2.8.x
Verified 2026-04