
How to build a custom reranker

Quick answer

Build a custom reranker by first retrieving candidate documents, then using an AI model like gpt-4o-mini to score or rank each candidate based on relevance to the query. Use the OpenAI SDK to send the query and candidates as messages, then reorder results by the model's scores or preferences.

PREREQUISITES

Python 3.8+
OpenAI API key (free tier works)
pip install "openai>=1.0" (quoted so the shell does not treat > as a redirect)

Setup

Install the openai Python package and set your API key as an environment variable.

Run pip install openai
Set OPENAI_API_KEY in your environment

bash

pip install "openai>=1.0"
export OPENAI_API_KEY="your-api-key"

Step by step

This example shows how to build a simple reranker that scores candidate texts against a query using gpt-4o-mini. It sends the query and each candidate to the model, asking for a relevance score, then sorts candidates by score.

python

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

query = "What are the health benefits of green tea?"
candidates = [
    "Green tea contains antioxidants that improve brain function.",
    "The history of green tea dates back thousands of years.",
    "Green tea can help with weight loss and reduce risk of heart disease.",
    "Many people enjoy green tea for its taste and aroma."
]

reranked = []

for text in candidates:
    # Ask for a bare number so the reply can be parsed with float().
    prompt = f"Score the relevance of this text to the query on a scale of 0 to 10. Reply with only the number.\nQuery: {query}\nText: {text}\nScore:"
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=5,
        temperature=0
    )
    # message.content can be None, so fall back to an empty string.
    score_text = (response.choices[0].message.content or "").strip()
    try:
        score = float(score_text)
    except ValueError:
        # Treat an unparseable reply as irrelevant.
        score = 0.0
    reranked.append((score, text))

# Highest score first.
reranked.sort(reverse=True, key=lambda x: x[0])

print("Reranked results:")
for score, text in reranked:
    print(f"Score: {score:.1f} - {text}")

output

Reranked results:
Score: 9.5 - Green tea can help with weight loss and reduce risk of heart disease.
Score: 8.7 - Green tea contains antioxidants that improve brain function.
Score: 3.2 - The history of green tea dates back thousands of years.
Score: 2.5 - Many people enjoy green tea for its taste and aroma.

These scores are illustrative; actual values depend on the model and can differ between runs.
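
The loop above treats any reply that float() cannot parse as 0.0, so a reply such as "Score: 8" or "8/10" silently drops to the bottom of the ranking. A more tolerant parser might look like the sketch below. Here parse_score is a hypothetical helper, not part of the OpenAI SDK, and clamping to the 0 to 10 range is an assumption based on the scale the prompt asks for.

```python
import re

# Hypothetical helper (not part of the OpenAI SDK): take the first number
# found in a model reply and clamp it to the scale used in the prompt.
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")

def parse_score(reply, default=0.0, low=0.0, high=10.0):
    match = _NUMBER.search(reply)
    if match is None:
        # No number anywhere in the reply: fall back to the default.
        return default
    return max(low, min(high, float(match.group())))

for reply in ["8", "Score: 7.5", "9/10", "not relevant"]:
    print(repr(reply), "->", parse_score(reply))
```

Swapping this in for the try/except around float(score_text) keeps the rest of the loop unchanged.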

Common variations

You can improve reranking by:

Using batch prompts to score multiple candidates in one request.
Trying a different model, such as claude-3-5-sonnet-20241022 through the Anthropic SDK, and comparing its relevance scores.
Using embeddings to compute similarity scores and combining them with LLM scores.
Implementing async calls for faster scoring of many candidates.
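
The batch-prompt variation from the list above can be sketched as two small functions: one builds a single prompt covering all candidates, the other validates the reply. The names build_batch_prompt and parse_batch_scores are hypothetical, and asking for a JSON array is an assumed reply format rather than anything the API enforces. The reply string below is a stand-in for response.choices[0].message.content, so the sketch runs without a network call.

```python
import json

def build_batch_prompt(query, candidates):
    # Number the candidates so the scores can be matched back by position.
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(candidates, start=1))
    return (
        "Score how relevant each numbered text is to the query, from 0 to 10.\n"
        "Reply with only a JSON array of numbers, one per text, in order.\n"
        f"Query: {query}\nTexts:\n{numbered}"
    )

def parse_batch_scores(reply, expected):
    """Return a list of floats, or None if the reply is unusable."""
    try:
        scores = json.loads(reply)
    except json.JSONDecodeError:
        return None
    if not isinstance(scores, list) or len(scores) != expected:
        return None
    try:
        return [float(s) for s in scores]
    except (TypeError, ValueError):
        return None

query = "What are the health benefits of green tea?"
candidates = [
    "Green tea contains antioxidants that improve brain function.",
    "The history of green tea dates back thousands of years.",
    "Green tea can help with weight loss and reduce risk of heart disease.",
    "Many people enjoy green tea for its taste and aroma.",
]

prompt = build_batch_prompt(query, candidates)
reply = "[8, 3, 9, 2]"  # made-up stand-in for the model's reply
scores = parse_batch_scores(reply, expected=len(candidates))
ranked = sorted(zip(scores, candidates), reverse=True, key=lambda pair: pair[0])
for score, text in ranked:
    print(f"Score: {score:.1f} - {text}")
```

When parse_batch_scores returns None, one option is to fall back to scoring each candidate in its own request. A batched request also needs a larger max_tokens than the single-score call.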

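The async variation uses AsyncOpenAI and awaits the same create() call:

python

import asyncio
import os
from openai import AsyncOpenAI

# AsyncOpenAI exposes the same create() call as an awaitable.
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Same query and candidates as in the step-by-step example.
query = "What are the health benefits of green tea?"
candidates = [
    "Green tea contains antioxidants that improve brain function.",
    "The history of green tea dates back thousands of years.",
    "Green tea can help with weight loss and reduce risk of heart disease.",
    "Many people enjoy green tea for its taste and aroma."
]

async def score_candidate(text):
    prompt = f"Score the relevance of this text to the query on a scale of 0 to 10. Reply with only the number.\nQuery: {query}\nText: {text}\nScore:"
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=5,
        temperature=0
    )
    reply = (response.choices[0].message.content or "").strip()
    try:
        return float(reply), text
    except ValueError:
        return 0.0, text

async def main():
    # Score all candidates concurrently.
    tasks = [score_candidate(text) for text in candidates]
    results = await asyncio.gather(*tasks)
    results.sort(reverse=True, key=lambda x: x[0])
    print("Async reranked results:")
    for score, text in results:
        print(f"Score: {score:.1f} - {text}")

if __name__ == "__main__":
    asyncio.run(main())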
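
Combining embedding similarity with LLM scores, another variation from the list above, might look like the following sketch. The vectors and LLM scores here are made-up toy values; in practice the vectors would come from an embeddings endpoint such as client.embeddings.create. The function combine_scores and its 0.7 weight are assumptions for illustration, not a recommended setting.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction, so treat its similarity as 0.
    return dot / norm if norm else 0.0

def combine_scores(llm_score, similarity, weight=0.7):
    """Blend a 0-10 LLM score with a cosine similarity in [-1, 1]."""
    llm_part = llm_score / 10.0          # rescale to [0, 1]
    sim_part = (similarity + 1.0) / 2.0  # rescale to [0, 1]
    return weight * llm_part + (1.0 - weight) * sim_part

# Toy data: (LLM score, embedding vector, text). All values are invented.
query_vec = [0.9, 0.1, 0.0]
scored = [
    (3.0, [0.1, 0.9, 0.2], "The history of green tea dates back thousands of years."),
    (9.0, [0.8, 0.2, 0.1], "Green tea can help with weight loss and reduce risk of heart disease."),
]

ranked = sorted(
    ((combine_scores(llm, cosine_similarity(query_vec, vec)), text)
     for llm, vec, text in scored),
    reverse=True,
)
for score, text in ranked:
    print(f"Combined: {score:.2f} - {text}")
```

A weighted average is only one way to fuse the two signals; the weight would need tuning against queries with known relevant documents.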

Troubleshooting

If scores are not numeric, ensure the prompt clearly asks for a numeric score only.
Handle API rate limits by adding retries or exponential backoff.
Check that the OPENAI_API_KEY environment variable is set correctly to avoid authentication errors.
For large candidate sets, batch several candidates per request to cut the number of calls, or use async calls to reduce latency.
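
The retry advice above can be sketched as a small wrapper with exponential backoff. The helper with_backoff is hypothetical, and the defaults (five attempts, a one-second base delay, up to 50% jitter) are assumptions rather than recommended values. The demo uses a stand-in function instead of a real API call and a no-op sleep so it runs instantly.

```python
import random
import time

def with_backoff(call, retries=5, base_delay=1.0, max_delay=30.0,
                 retry_on=(Exception,), sleep=time.sleep):
    """Call `call()` and retry on failure, doubling the wait each time."""
    for attempt in range(retries):
        try:
            return call()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from concurrent workers.
            sleep(delay + random.uniform(0, delay / 2))

# Demo with a stand-in for the API call that fails twice, then succeeds.
attempts = {"count": 0}

def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "8"

print(with_backoff(flaky_call, sleep=lambda seconds: None))
```

With the OpenAI client, the scoring request could be wrapped in a lambda and passed as call, with retry_on=(openai.RateLimitError,) so that only rate-limit errors are retried. The SDK client also accepts a max_retries option, which may be enough on its own.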


Key Takeaways

Use an LLM like gpt-4o-mini to score candidate texts for relevance to a query.
Send clear prompts requesting numeric relevance scores to enable sorting.
Optimize reranking with async calls or batch processing for efficiency.

Verified 2026-04 · gpt-4o-mini, claude-3-5-sonnet-20241022
