How to build a custom reranker

Quick answer
Build a custom reranker by first retrieving candidate documents, then using an AI model like gpt-4o-mini to score or rank each candidate based on relevance to the query. Use the OpenAI SDK to send the query and candidates as messages, then reorder results by the model's scores or preferences.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0" (quote the requirement so the shell does not treat > as a redirect)

Setup

Install the openai Python package and set your API key as an environment variable.

  • Run pip install openai
  • Set OPENAI_API_KEY in your environment
bash
pip install "openai>=1.0"
export OPENAI_API_KEY="your-api-key"  # placeholder; replace with your own key
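
Before sending any requests, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch; the helper name `require_api_key` is hypothetical and not part of the OpenAI SDK.

```python
import os

def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the examples.")
    return key
```

Calling `require_api_key()` at the top of a script fails early with a readable message instead of an authentication error on the first request.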

Step by step

This example shows how to build a simple reranker that scores candidate texts against a query using gpt-4o-mini. It sends the query and each candidate to the model, asking for a relevance score, then sorts candidates by score.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

query = "What are the health benefits of green tea?"
candidates = [
    "Green tea contains antioxidants that improve brain function.",
    "The history of green tea dates back thousands of years.",
    "Green tea can help with weight loss and reduce risk of heart disease.",
    "Many people enjoy green tea for its taste and aroma."
]

reranked = []

for text in candidates:
    # Ask for a bare number so the reply can be parsed with float().
    prompt = (
        "Score the relevance of this text to the query on a scale of 0 to 10. "
        "Reply with the number only.\n"
        f"Query: {query}\nText: {text}\nScore:"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=5,
        temperature=0,
    )
    score_text = response.choices[0].message.content.strip()
    try:
        score = float(score_text)
    except ValueError:
        # Fall back to the lowest score if the reply is not a plain number.
        score = 0.0
    reranked.append((score, text))

# Highest score first.
reranked.sort(reverse=True, key=lambda x: x[0])

print("Reranked results:")
for score, text in reranked:
    print(f"Score: {score:.1f} - {text}")
output
Reranked results:
Score: 9.5 - Green tea can help with weight loss and reduce risk of heart disease.
Score: 8.7 - Green tea contains antioxidants that improve brain function.
Score: 3.2 - The history of green tea dates back thousands of years.
Score: 2.5 - Many people enjoy green tea for its taste and aroma.
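
The `float()` call in the example falls back to 0.0 whenever the model adds any extra text, such as "Score: 8" or "8/10". A more tolerant parser is sketched below. The function name `parse_score` and its defaults are assumptions for illustration; it takes the first number found in the reply and clamps it to the 0 to 10 scale.

```python
import re

# Matches an optionally signed integer or decimal number.
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")

def parse_score(reply, low=0.0, high=10.0, default=0.0):
    """Extract the first number from a model reply and clamp it to [low, high]."""
    match = _NUMBER.search(reply)
    if match is None:
        return default
    return max(low, min(high, float(match.group())))
```

Swapping this in for the `try`/`except` around `float(score_text)` keeps the rest of the loop unchanged.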

Common variations

You can improve reranking by:

  • Using batch prompts to score multiple candidates in one request.
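  • Trying a different model, such as claude-3-5-sonnet-20241022, and comparing its relevance scores. That is an Anthropic model, so it needs the Anthropic SDK rather than the OpenAI client shown here.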
  • Using embeddings to compute similarity scores and combine with LLM scores.
  • Implementing async calls for faster scoring of many candidates.
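
The batch-prompt variation can be sketched as follows. This is one possible approach, not a fixed recipe: the helper names (`build_batch_prompt`, `parse_batch_scores`, `rerank_batch`) and the JSON-array reply format are assumptions. `client` is expected to be an `OpenAI()` instance like the one created in the step-by-step example.

```python
import json

def build_batch_prompt(query, candidates):
    """Build one prompt that asks for a score per numbered candidate."""
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(candidates, start=1))
    return (
        "Score the relevance of each text to the query on a scale of 0 to 10.\n"
        "Reply with only a JSON array of numbers, one per text, in the same order.\n"
        f"Query: {query}\nTexts:\n{numbered}"
    )

def parse_batch_scores(reply, expected):
    """Parse a JSON array of scores; fall back to zeros if the reply is malformed."""
    try:
        scores = json.loads(reply)
    except json.JSONDecodeError:
        return [0.0] * expected
    if not isinstance(scores, list) or len(scores) != expected:
        return [0.0] * expected
    try:
        return [float(s) for s in scores]
    except (TypeError, ValueError):
        return [0.0] * expected

def rerank_batch(client, query, candidates):
    """Score all candidates in a single request and sort them, best first."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": build_batch_prompt(query, candidates)}],
        temperature=0,
    )
    reply = response.choices[0].message.content.strip()
    scores = parse_batch_scores(reply, len(candidates))
    return sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
```

One request replaces one call per candidate, at the price of stricter output parsing: if the model wraps the array in extra text, this sketch scores every candidate as 0.0, so it is worth logging the raw reply while testing.
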
python
import asyncio
import os
from openai import AsyncOpenAI

# In openai>=1.0 the async client is AsyncOpenAI; its create() method is awaitable.
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Reuses `query` and `candidates` from the step-by-step example.

async def score_candidate(text):
    prompt = f"Score the relevance of this text to the query on a scale of 0 to 10:\nQuery: {query}\nText: {text}\nScore:"
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=5,
        temperature=0
    )
    raw_score = response.choices[0].message.content.strip()
    try:
        return float(raw_score), text
    except ValueError:
        # Fall back to the lowest score if the reply is not a plain number.
        return 0.0, text

async def main():
    # Send all scoring requests concurrently and wait for every result.
    tasks = [score_candidate(text) for text in candidates]
    results = await asyncio.gather(*tasks)
    results.sort(reverse=True, key=lambda x: x[0])
    print("Async reranked results:")
    for score, text in results:
        print(f"Score: {score:.1f} - {text}")

# Run async example
# asyncio.run(main())
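
The embeddings variation listed above can be sketched as a weighted blend of the LLM score and the cosine similarity between query and candidate embeddings. The helper names, the 0.7 weight, and the choice of `text-embedding-3-small` are assumptions for illustration; `client` is expected to be an `OpenAI()` instance.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def blend_scores(llm_score, similarity, weight=0.7):
    """Weighted mix of an LLM score (0 to 10) and a cosine similarity (-1 to 1)."""
    llm_part = llm_score / 10.0          # rescale to the 0-1 range
    sim_part = (similarity + 1.0) / 2.0  # rescale to the 0-1 range
    return weight * llm_part + (1.0 - weight) * sim_part

def embed(client, texts, model="text-embedding-3-small"):
    """Fetch one embedding per input text."""
    response = client.embeddings.create(model=model, input=texts)
    return [item.embedding for item in response.data]

def rerank_hybrid(client, query, scored_candidates, weight=0.7):
    """Re-sort (llm_score, text) pairs by the blended score, best first."""
    texts = [text for _, text in scored_candidates]
    vectors = embed(client, [query] + texts)
    query_vec, text_vecs = vectors[0], vectors[1:]
    blended = [
        (blend_scores(score, cosine_similarity(query_vec, vec), weight), text)
        for (score, text), vec in zip(scored_candidates, text_vecs)
    ]
    return sorted(blended, key=lambda pair: pair[0], reverse=True)
```

Passing the `reranked` list from the step-by-step example as `scored_candidates` re-sorts it with the blended score. The weight is a tuning knob; the right value depends on your data.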

Troubleshooting

  • If scores are not numeric, ensure the prompt clearly asks for a numeric score only.
  • Handle API rate limits by adding retries or exponential backoff.
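  • Check that the OPENAI_API_KEY environment variable is set correctly to avoid authentication errors.
  • For large candidate sets, batch several candidates into one request to cut the number of calls, and use async calls to reduce wall-clock latency. Async alone does not lower token cost.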
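
The retry advice above can be sketched as a small wrapper that waits longer after each failed attempt. The function name `with_backoff` and its parameters are hypothetical; the sketch is independent of any SDK and retries whatever exception types you pass in.

```python
import random
import time

def with_backoff(call, retry_on=(Exception,), max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `call()` and retry on the given exceptions with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            # Wait 1x, 2x, 4x, ... the base delay, plus a little random jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1 * base_delay)
            sleep(delay)
```

With the OpenAI SDK, a typical use would be `with_backoff(lambda: client.chat.completions.create(...), retry_on=(openai.RateLimitError,))`. The client also accepts a `max_retries` argument, which may be enough on its own for light workloads.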

Key Takeaways

  • Use an LLM like gpt-4o-mini to score candidate texts for relevance to a query.
  • Send clear prompts requesting numeric relevance scores to enable sorting.
  • Optimize reranking with async calls or batch processing for efficiency.
Verified 2026-04 · gpt-4o-mini, claude-3-5-sonnet-20241022