How to use RRF with reranking
Quick answer
Use Reciprocal Rank Fusion (RRF) to combine multiple ranked lists: score each item in each list as 1 / (k + rank), where rank is the item's 1-based position and k is a smoothing constant (commonly 60), then sum the scores across lists. In Python, merge the results from different retrievers or models this way and sort by the combined score to produce the final ranking, which can improve search relevance.
Prerequisites
- Python 3.8+
- pip install numpy
- Basic knowledge of ranking and search results
- Access to multiple ranked result lists (e.g., from AI retrievers or models)
Setup
Make sure you have Python 3.8 or newer, then install numpy for efficient numerical operations. If your retrievers call hosted AI APIs, set the required credentials as environment variables before running retrieval.
pip install numpy
Step by step
This example implements RRF reranking by combining two ranked lists of document IDs. Each list is ordered from most to least relevant, so a document's rank is its 1-based position in the list. The code sums the RRF scores per document and prints the reranked list.
import numpy as np
# Reciprocal Rank Fusion function
# ranked_lists: list of lists of document IDs in ranked order
# k: constant to dampen rank influence (commonly 60)
def rrf_rerank(ranked_lists, k=60):
    score_map = {}
    for ranked_list in ranked_lists:
        for rank, doc_id in enumerate(ranked_list, start=1):
            score = 1 / (k + rank)
            score_map[doc_id] = score_map.get(doc_id, 0) + score
    # Sort documents by combined RRF score descending
    reranked = sorted(score_map.items(), key=lambda x: x[1], reverse=True)
    return [doc_id for doc_id, score in reranked]
# Example ranked lists from two retrievers
ranked_list_1 = ['doc1', 'doc2', 'doc3', 'doc4']
ranked_list_2 = ['doc3', 'doc2', 'doc5', 'doc6']
reranked_results = rrf_rerank([ranked_list_1, ranked_list_2])
print("Reranked results:", reranked_results)
output
Reranked results: ['doc3', 'doc2', 'doc1', 'doc5', 'doc4', 'doc6']
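To see where the ordering comes from, it can help to print the fused score of each document. The sketch below uses a hypothetical helper, `rrf_scores`, which is not part of the example above; it returns the score map instead of only the sorted IDs and assumes the same two lists with k=60.

```python
def rrf_scores(ranked_lists, k=60):
    """Return a dict mapping each doc ID to its summed RRF score (hypothetical helper)."""
    scores = {}
    for ranked_list in ranked_lists:
        # Ranks are 1-based positions within each list
        for rank, doc_id in enumerate(ranked_list, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores

ranked_list_1 = ['doc1', 'doc2', 'doc3', 'doc4']
ranked_list_2 = ['doc3', 'doc2', 'doc5', 'doc6']

scores = rrf_scores([ranked_list_1, ranked_list_2])
ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for doc_id, score in ordered:
    print(f"{doc_id}: {score:.6f}")
```

With these inputs, doc3 (ranks 3 and 1) scores 1/63 + 1/61 ≈ 0.032266, narrowly ahead of doc2 (rank 2 in both lists) at 2/62 ≈ 0.032258. Documents returned by only one retriever score roughly half as much.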
Common variations
Adjust the k parameter to control how strongly top positions dominate the fused score: a smaller k gives more weight to the highest ranks, while a larger k flattens the differences between ranks. RRF works with any number of ranked lists, so you can fuse results from more than two AI retrievers or models. For asynchronous retrieval, gather the results concurrently before applying RRF. You can also apply RRF after initial AI model scoring to boost final ranking quality.
import asyncio
# Assumes rrf_rerank from the step-by-step example is defined in the same module

async def fetch_ranked_list_1():
    # Simulate async retrieval
    await asyncio.sleep(0.1)
    return ['doc1', 'doc2', 'doc3', 'doc4']

async def fetch_ranked_list_2():
    await asyncio.sleep(0.1)
    return ['doc3', 'doc2', 'doc5', 'doc6']

async def main():
    # Run both retrievals concurrently, then fuse the two ranked lists
    results = await asyncio.gather(fetch_ranked_list_1(), fetch_ranked_list_2())
    reranked = rrf_rerank(results, k=50)
    print("Async reranked results:", reranked)

if __name__ == "__main__":
    asyncio.run(main())
output
Async reranked results: ['doc3', 'doc2', 'doc1', 'doc5', 'doc4', 'doc6']
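The effect of k is easiest to see on inputs where a single top rank competes with agreement between retrievers. The following is a minimal sketch with made-up document IDs. `rrf_rerank` is redefined here only so the snippet runs on its own, and the two lists are illustrative assumptions rather than real retriever output.

```python
def rrf_rerank(ranked_lists, k=60):
    """Fuse ranked lists with RRF and return doc IDs sorted by combined score."""
    score_map = {}
    for ranked_list in ranked_lists:
        for rank, doc_id in enumerate(ranked_list, start=1):
            score_map[doc_id] = score_map.get(doc_id, 0) + 1 / (k + rank)
    reranked = sorted(score_map.items(), key=lambda x: x[1], reverse=True)
    return [doc_id for doc_id, _ in reranked]

# Hypothetical inputs: 'agreed' is third in both lists,
# 'top_once' is first in one list and absent from the other.
list_a = ['top_once', 'filler_a', 'agreed']
list_b = ['other_top', 'filler_b', 'agreed']

low_k = rrf_rerank([list_a, list_b], k=0)
high_k = rrf_rerank([list_a, list_b], k=60)
print("k=0 :", low_k)
print("k=60:", high_k)
```

With k=0, a single first place (score 1) beats two third places (2/3), so 'top_once' leads. With k=60, the two third places (2/63) beat the single first place (1/61), so 'agreed' leads. Larger k values therefore favor documents that several retrievers agree on.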
Troubleshooting
- If reranked results seem unchanged, verify your input ranked lists are correctly ordered by relevance.
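- If some documents never appear near the top of the fused list, check whether they occur in more than one input list; documents returned by a single retriever score lower than documents that several lists share. Lowering k reduces the rank damping so that a high rank in one list counts for more.
- For large result sets, optimize by using dictionaries and numpy arrays for faster score aggregation.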
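Before tuning anything, it may be worth checking the inputs themselves. The sketch below uses a hypothetical helper, `check_ranked_lists`, that reports duplicate IDs inside each list (a duplicate would be scored once per occurrence by the `rrf_rerank` function shown earlier) and the IDs that two or more lists have in common. The sample lists are invented for illustration.

```python
from collections import Counter

def check_ranked_lists(ranked_lists):
    """Report duplicates within each list and IDs shared by two or more lists."""
    duplicates = []
    for ranked_list in ranked_lists:
        counts = Counter(ranked_list)
        duplicates.append(sorted(doc_id for doc_id, n in counts.items() if n > 1))
    # Count in how many distinct lists each ID appears
    appearances = Counter(
        doc_id for ranked_list in ranked_lists for doc_id in set(ranked_list)
    )
    shared = sorted(doc_id for doc_id, n in appearances.items() if n > 1)
    return {"duplicates": duplicates, "shared": shared}

report = check_ranked_lists([
    ['doc1', 'doc2', 'doc2', 'doc3'],  # 'doc2' is repeated by mistake
    ['doc3', 'doc5'],
])
print(report)
```

An empty "shared" list means no document gains from agreement between retrievers, so the fused order is driven purely by each document's position in its own list.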
Key Takeaways
- Implement RRF by summing reciprocal rank scores across multiple ranked lists to improve search relevance.
- Adjust the damping parameter k to tune the influence of rank positions in reranking.
- Combine RRF with asynchronous retrieval from multiple AI models for efficient reranking.
- Ensure input ranked lists are correctly ordered and contain overlapping documents for effective fusion.
- Use Python dictionaries and sorting for a clean, compact RRF implementation.