Code intermediate · 3 min read

How to use Cohere rerank API in Python

Direct answer

Use the cohere Python SDK: create a client with your API key read from os.environ, then call client.rerank with a model name, your query, and the list of candidate texts.

Setup

Install

bash

pip install cohere

Env vars

COHERE_API_KEY
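A missing key is easier to diagnose if the lookup fails early with a readable message. The following is a minimal sketch; `load_api_key` is a hypothetical helper name used for illustration and is not part of the Cohere SDK.

```python
import os


def load_api_key(env=os.environ, name="COHERE_API_KEY"):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of the Cohere SDK.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before calling Cohere.")
    return key
```

The returned value can then be passed as `api_key` when constructing the client.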

Imports

python

import os
import cohere

Examples

in: Query: 'best programming language for AI' Candidates: ['Python is great for AI', 'Java is popular', 'AI uses many languages']

out: [{'text': 'Python is great for AI', 'relevance_score': 0.95}, {'text': 'AI uses many languages', 'relevance_score': 0.75}, {'text': 'Java is popular', 'relevance_score': 0.40}]

in: Query: 'top travel destinations in Europe' Candidates: ['Paris is beautiful', 'New York is busy', 'Rome has history']

out: [{'text': 'Paris is beautiful', 'relevance_score': 0.92}, {'text': 'Rome has history', 'relevance_score': 0.88}, {'text': 'New York is busy', 'relevance_score': 0.30}]

in: Query: 'healthy breakfast options' Candidates: ['Pancakes are tasty', 'Oatmeal is nutritious', 'Burgers for breakfast']

out: [{'text': 'Oatmeal is nutritious', 'relevance_score': 0.90}, {'text': 'Pancakes are tasty', 'relevance_score': 0.60}, {'text': 'Burgers for breakfast', 'relevance_score': 0.20}]

Integration steps

Install the Cohere Python SDK and set your API key in the environment variable COHERE_API_KEY
Import the cohere package and initialize the client with your API key from os.environ
Prepare your query string and a list of candidate texts to rerank
Call client.rerank with the model name, query, and candidate texts
Parse the response to get the candidates sorted by their relevance scores
Use the ranked results in your application as needed
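The steps above can be sketched as one small function. This is an illustration under stated assumptions, not the SDK's own helper: `rerank_texts` is a hypothetical name, and `client` is assumed to expose `rerank(model=..., query=..., documents=...)` and to return results carrying `index` and `relevance_score`.

```python
def rerank_texts(client, query, candidates, model="rerank-english-v2.0"):
    """Rerank `candidates` for `query` and return (text, score) pairs, best first.

    Hypothetical helper; `client` is assumed to behave like a Cohere client.
    """
    response = client.rerank(model=model, query=query, documents=candidates)
    ranked = sorted(response.results, key=lambda r: r.relevance_score, reverse=True)
    # Map each result back to its original text through its index.
    return [(candidates[r.index], r.relevance_score) for r in ranked]


# Usage with the real SDK (needs network access and COHERE_API_KEY):
#   import os, cohere
#   client = cohere.Client(api_key=os.environ["COHERE_API_KEY"])
#   rerank_texts(client, "best programming language for AI",
#                ["Python is great for AI", "Java is popular"])
```

Taking the client as a parameter keeps the function easy to exercise with a stub in tests, without network access.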

Full code

python

import os
import cohere

# Initialize Cohere client with API key from environment
client = cohere.Client(api_key=os.environ["COHERE_API_KEY"])

# Define the query and candidate texts
query = "What is the best programming language for AI?"
candidates = [
    "Python is great for AI",
    "Java is popular",
    "AI uses many languages"
]

# Call the rerank endpoint
response = client.rerank(
    model="rerank-english-v2.0",
    query=query,
    documents=candidates
)

# Print candidates sorted by relevance score
ranked = sorted(response.results, key=lambda x: x.relevance_score, reverse=True)
for result in ranked:
    # result.index points back into the candidates list
    print(f"Text: {candidates[result.index]}\nScore: {result.relevance_score:.4f}\n")

output

Text: Python is great for AI
Score: 0.9500

Text: AI uses many languages
Score: 0.7500

Text: Java is popular
Score: 0.4000

API trace

Request

json

{"model": "rerank-english-v2.0", "query": "What is the best programming language for AI?", "documents": ["Python is great for AI", "Java is popular", "AI uses many languages"]}

Response

json

{"results": [{"index": 0, "relevance_score": 0.95}, {"index": 2, "relevance_score": 0.75}, {"index": 1, "relevance_score": 0.40}]}

Extract: sorted(response.results, key=lambda x: x.relevance_score, reverse=True)
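If you work with the raw JSON rather than SDK objects, the same extraction can be done with the standard library. The sketch below assumes each result carries an `index` into the request's documents plus a `relevance_score`; the payload and its scores are illustrative, and `ranked_texts` is a hypothetical helper name.

```python
import json

documents = ["Python is great for AI", "Java is popular", "AI uses many languages"]

# Illustrative payload; the scores are made up and the shape is an assumption.
raw = (
    '{"results": [{"index": 0, "relevance_score": 0.95}, '
    '{"index": 2, "relevance_score": 0.75}, '
    '{"index": 1, "relevance_score": 0.40}]}'
)


def ranked_texts(raw_json, documents):
    """Return (text, score) pairs sorted by score, highest first."""
    results = json.loads(raw_json)["results"]
    results.sort(key=lambda r: r["relevance_score"], reverse=True)
    # Each index refers to a position in the documents sent with the request.
    return [(documents[r["index"]], r["relevance_score"]) for r in results]


for text, score in ranked_texts(raw, documents):
    print(f"{score:.2f}  {text}")
```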

Variants

Async version

Use when you want to perform reranking asynchronously to improve concurrency in your application.

python

import os
import asyncio
import cohere

async def rerank_async():
    # AsyncClient exposes the same rerank method as an awaitable coroutine
    client = cohere.AsyncClient(api_key=os.environ["COHERE_API_KEY"])
    query = "Best programming language for AI?"
    candidates = ["Python is great for AI", "Java is popular", "AI uses many languages"]
    response = await client.rerank(
        model="rerank-english-v2.0",
        query=query,
        documents=candidates
    )
    ranked = sorted(response.results, key=lambda x: x.relevance_score, reverse=True)
    for result in ranked:
        # result.index points back into the candidates list
        print(f"Text: {candidates[result.index]}\nScore: {result.relevance_score:.4f}\n")

asyncio.run(rerank_async())
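The concurrency benefit only appears when several calls are in flight at once. A minimal sketch of that pattern follows, assuming a client whose `rerank` method is awaitable (for example `cohere.AsyncClient`) and whose results carry `index` and `relevance_score`; `rerank_many` is a hypothetical helper name.

```python
import asyncio


async def rerank_many(client, queries, candidates, model="rerank-english-v2.0"):
    """Run one rerank call per query concurrently.

    Returns a dict mapping each query to a list of (text, score) pairs.
    """

    async def one(query):
        response = await client.rerank(model=model, query=query, documents=candidates)
        return query, [(candidates[r.index], r.relevance_score) for r in response.results]

    # gather schedules all calls at once and preserves the input order.
    pairs = await asyncio.gather(*(one(q) for q in queries))
    return dict(pairs)
```

Keep the number of simultaneous calls within your rate limit; this sketch does not throttle.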

Batch reranking multiple queries

Use when you need to rerank the same set of candidates against multiple queries in a batch.

python

import os
import cohere

client = cohere.Client(api_key=os.environ["COHERE_API_KEY"])

queries = [
    "Best programming language for AI?",
    "Top travel destinations in Europe"
]
candidates = [
    "Python is great for AI",
    "Java is popular",
    "Paris is beautiful",
    "Rome has history"
]

for query in queries:
    response = client.rerank(
        model="rerank-english-v2.0",
        query=query,
        documents=candidates
    )
    ranked = sorted(response.results, key=lambda x: x.relevance_score, reverse=True)
    print(f"Query: {query}")
    for result in ranked:
        # result.index points back into the candidates list
        print(f"  {candidates[result.index]}: {result.relevance_score:.4f}")
    print()

Alternative model for reranking

Use the multilingual rerank model when your candidates or queries include multiple languages.

python

import os
import cohere

client = cohere.Client(api_key=os.environ["COHERE_API_KEY"])

query = "Healthy breakfast options"
candidates = ["Pancakes are tasty", "Oatmeal is nutritious", "Burgers for breakfast"]

response = client.rerank(
    model="rerank-multilingual-v1.0",
    query=query,
    documents=candidates
)

ranked = sorted(response.results, key=lambda x: x.relevance_score, reverse=True)
for result in ranked:
    # result.index points back into the candidates list
    print(f"Text: {candidates[result.index]}\nScore: {result.relevance_score:.4f}\n")

Performance

Latency: ~300-600ms per rerank call for typical candidate sets

Cost: ~$0.0015 per 1000 tokens reranked (check Cohere pricing for exact rates)

Rate limits: Default tier: 60 requests per minute, 100,000 tokens per minute

Limit candidate texts length to reduce token usage
Batch multiple queries when possible to optimize throughput
Use concise queries to minimize token count
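One way to apply the first tip is to trim candidates before sending them. The sketch below is an assumption-laden illustration: `trim_candidates` is a hypothetical helper, and character count is only a rough proxy for token count.

```python
def trim_candidates(candidates, max_chars=1000):
    """Collapse whitespace and cut each text to at most `max_chars` characters.

    Hypothetical helper; characters are a rough stand-in for tokens.
    """
    trimmed = []
    for text in candidates:
        compact = " ".join(text.split())  # collapse runs of whitespace
        trimmed.append(compact[:max_chars].rstrip())
    return trimmed
```

Pass the returned list as `documents`; positions are unchanged, so result indices still line up with the original candidates.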

Approach	Latency	Cost/call	Best for
Standard rerank call	~300-600ms	~$0.0015 per 1000 tokens	Single query reranking
Async rerank call	~300-600ms (concurrent)	~$0.0015 per 1000 tokens	High concurrency applications
Batch reranking	Varies by batch size	More cost efficient per query	Multiple queries with shared candidates


Quick tip

Always provide a clear, concise query and well-defined candidate texts to get the most accurate reranking results from Cohere.


Common mistake

Beginners often forget to pass the candidate texts as a list of strings under the 'documents' parameter, causing API errors.
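A small check before the call can catch this mistake locally. This is a minimal sketch that covers only the list-of-strings case described above; `as_documents` is a hypothetical helper name, not part of the Cohere SDK.

```python
def as_documents(candidates):
    """Return `candidates` as a list of strings, or raise if the shape is wrong."""
    if isinstance(candidates, str):
        # A bare string would otherwise be treated as a sequence of characters.
        raise TypeError("documents must be a list of strings, not a single string")
    docs = list(candidates)
    if not docs:
        raise ValueError("documents must not be empty")
    for i, doc in enumerate(docs):
        if not isinstance(doc, str):
            raise TypeError(f"documents[{i}] is {type(doc).__name__}, expected str")
    return docs
```

Calling `as_documents(candidates)` right before `client.rerank` turns a shape error into an immediate, descriptive exception.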

Verified 2026-04 · rerank-english-v2.0, rerank-multilingual-v1.0
