How to use Cohere rerank API in Python

Direct answer
Use the cohere Python SDK: create a client with your API key read from os.environ, then call client.rerank with a model name, your query, and the list of candidate texts.

Setup

Install
bash
pip install cohere
Env vars
COHERE_API_KEY
Imports
python
import os
import cohere
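
Because the key is read from the environment, a missing COHERE_API_KEY would otherwise surface as a bare KeyError. A minimal sketch of a friendlier check follows; the helper name get_api_key is hypothetical and not part of the cohere SDK.

```python
import os

def get_api_key(var_name="COHERE_API_KEY", environ=None):
    """Return the API key from the environment, or raise a readable error.

    get_api_key is an illustrative helper, not a cohere SDK function.
    """
    env = os.environ if environ is None else environ
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before creating the client"
        )
    return key
```

The returned value can then be passed as the api_key argument when the client is created.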

Examples

in: Query: 'best programming language for AI' Candidates: ['Python is great for AI', 'Java is popular', 'AI uses many languages']
out: [{'text': 'Python is great for AI', 'relevance_score': 0.95}, {'text': 'AI uses many languages', 'relevance_score': 0.75}, {'text': 'Java is popular', 'relevance_score': 0.40}]
in: Query: 'top travel destinations in Europe' Candidates: ['Paris is beautiful', 'New York is busy', 'Rome has history']
out: [{'text': 'Paris is beautiful', 'relevance_score': 0.92}, {'text': 'Rome has history', 'relevance_score': 0.88}, {'text': 'New York is busy', 'relevance_score': 0.30}]
in: Query: 'healthy breakfast options' Candidates: ['Pancakes are tasty', 'Oatmeal is nutritious', 'Burgers for breakfast']
out: [{'text': 'Oatmeal is nutritious', 'relevance_score': 0.90}, {'text': 'Pancakes are tasty', 'relevance_score': 0.60}, {'text': 'Burgers for breakfast', 'relevance_score': 0.20}]

Integration steps

  1. Install the Cohere Python SDK and set your API key in the environment variable COHERE_API_KEY
  2. Import the cohere package and initialize the client with your API key from os.environ
  3. Prepare your query string and a list of candidate texts to rerank
  4. Call client.rerank with the model name, query, and candidate texts
  5. Parse the response to get the candidates sorted by their relevance scores
  6. Use the ranked results in your application as needed

Full code

python
import os
import cohere

# Initialize Cohere client with API key from environment
client = cohere.Client(api_key=os.environ["COHERE_API_KEY"])

# Define the query and candidate texts
query = "What is the best programming language for AI?"
candidates = [
    "Python is great for AI",
    "Java is popular",
    "AI uses many languages"
]

# Call the rerank endpoint
response = client.rerank(
    model="rerank-english-v2.0",
    query=query,
    documents=candidates
)

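# Print candidates sorted by relevance score.
# Each result carries the index of its candidate in the input list, so the text
# is looked up by index; this works whether or not the response echoes the text.
ranked = sorted(response.results, key=lambda x: x.relevance_score, reverse=True)
for result in ranked:
    print(f"Text: {candidates[result.index]}\nScore: {result.relevance_score:.4f}\n")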
output
Text: Python is great for AI
Score: 0.9500

Text: AI uses many languages
Score: 0.7500

Text: Java is popular
Score: 0.4000

API trace

Request
json
{"model": "rerank-english-v2.0", "query": "What is the best programming language for AI?", "documents": ["Python is great for AI", "Java is popular", "AI uses many languages"]}
Response
json
{"results": [{"index": 0, "relevance_score": 0.95}, {"index": 2, "relevance_score": 0.75}, {"index": 1, "relevance_score": 0.40}]}
Extract
python
[(candidates[r.index], r.relevance_score) for r in response.results]
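
To make the extraction step concrete without calling the API, here is a small sketch that parses a response body and joins each result back to its input text. It assumes every result carries an "index" field pointing into the request's documents list; the scores are illustrative placeholders, and extract_ranked is a hypothetical helper name.

```python
import json

documents = ["Python is great for AI", "Java is popular", "AI uses many languages"]

# Illustrative response body; the scores are placeholders, and the "index"
# field per result is an assumption about the payload shape.
raw_response = (
    '{"results": [{"index": 0, "relevance_score": 0.95}, '
    '{"index": 2, "relevance_score": 0.75}, '
    '{"index": 1, "relevance_score": 0.40}]}'
)

def extract_ranked(raw, docs):
    """Return (text, score) pairs, highest score first."""
    results = json.loads(raw)["results"]
    pairs = [(docs[item["index"]], item["relevance_score"]) for item in results]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

ranked = extract_ranked(raw_response, documents)
for text, score in ranked:
    print(f"{score:.2f}  {text}")
```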

Variants

Async version

Use when you want to perform reranking asynchronously to improve concurrency in your application.

python
import os
import asyncio
import cohere

async def rerank_async():
    # AsyncClient exposes the same rerank method as Client, but it is awaitable
    client = cohere.AsyncClient(api_key=os.environ["COHERE_API_KEY"])
    query = "Best programming language for AI?"
    candidates = ["Python is great for AI", "Java is popular", "AI uses many languages"]
    response = await client.rerank(
        model="rerank-english-v2.0",
        query=query,
        documents=candidates
    )
    ranked = sorted(response.results, key=lambda x: x.relevance_score, reverse=True)
    for result in ranked:
        # Look up each text by its index in the input list
        print(f"Text: {candidates[result.index]}\nScore: {result.relevance_score:.4f}\n")

asyncio.run(rerank_async())
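
An async call pays off mainly when several rerank requests are in flight at once. The sketch below shows one way to do that with asyncio.gather and a semaphore that caps concurrency. The function rerank_many and the stand-in coroutine fake_rerank are hypothetical names; in a real application the stand-in would be replaced by a coroutine that awaits the rerank call.

```python
import asyncio

async def rerank_many(rerank_one, queries, max_concurrency=5):
    """Run rerank_one(query) for every query, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(query):
        async with semaphore:
            return await rerank_one(query)

    # gather preserves input order, so results[i] belongs to queries[i]
    return await asyncio.gather(*(guarded(q) for q in queries))

async def fake_rerank(query):
    # Stand-in for an awaitable rerank call; it just echoes the query
    await asyncio.sleep(0)
    return f"ranked:{query}"

results = asyncio.run(rerank_many(fake_rerank, ["q1", "q2", "q3"], max_concurrency=2))
print(results)
```

Capping concurrency keeps a burst of queries from exceeding whatever rate limit applies to your account.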

Batch reranking multiple queries

Use when you need to rerank the same set of candidates against multiple queries. Each query is sent as its own rerank call, so the loop below makes one request per query.

python
import os
import cohere

client = cohere.Client(api_key=os.environ["COHERE_API_KEY"])

queries = [
    "Best programming language for AI?",
    "Top travel destinations in Europe"
]
candidates = [
    "Python is great for AI",
    "Java is popular",
    "Paris is beautiful",
    "Rome has history"
]

for query in queries:
    response = client.rerank(
        model="rerank-english-v2.0",
        query=query,
        documents=candidates
    )
    # Results are matched back to their texts through each result's index
    ranked = sorted(response.results, key=lambda x: x.relevance_score, reverse=True)
    print(f"Query: {query}")
    for result in ranked:
        print(f"  {candidates[result.index]}: {result.relevance_score:.4f}")
    print()

Alternative model for reranking

Use the multilingual rerank model when your candidates or queries include multiple languages.

python
import os
import cohere

client = cohere.Client(api_key=os.environ["COHERE_API_KEY"])

query = "Healthy breakfast options"
candidates = ["Pancakes are tasty", "Oatmeal is nutritious", "Burgers for breakfast"]

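# Pass the multilingual model's identifier here (see Cohere's model list for current names)
response = client.rerank(
    model="rerank-multilingual-v1.0",
    query=query,
    documents=candidates
)

ranked = sorted(response.results, key=lambda x: x.relevance_score, reverse=True)
for result in ranked:
    # Look up each text by its index in the input list
    print(f"Text: {candidates[result.index]}\nScore: {result.relevance_score:.4f}\n")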

Performance

Latency: ~300-600ms per rerank call for typical candidate sets
Cost: ~$0.0015 per 1000 tokens reranked (check Cohere pricing for exact rates)
Rate limits (default tier): 60 requests per minute, 100,000 tokens per minute
  • Limit candidate texts length to reduce token usage
  • Batch multiple queries when possible to optimize throughput
  • Use concise queries to minimize token count
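
As a rough illustration of the first tip, the sketch below trims each candidate to a fixed number of words before it is sent. Word count is only a crude proxy for token count, and truncate_candidates is a hypothetical helper, not an SDK function.

```python
def truncate_candidates(candidates, max_words=50):
    """Keep at most max_words whitespace-separated words of each candidate."""
    return [" ".join(text.split()[:max_words]) for text in candidates]

docs = ["Python is great for AI and many other tasks", "Java is popular"]
short_docs = truncate_candidates(docs, max_words=4)
print(short_docs)
```

Note that this also collapses runs of whitespace, which is usually harmless for reranking.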

Approach | Latency | Cost/call | Best for
Standard rerank call | ~300-600ms | ~$0.0015 per 1000 tokens | Single query reranking
Async rerank call | ~300-600ms (concurrent) | ~$0.0015 per 1000 tokens | High concurrency applications
Batch reranking | Varies by batch size | More cost efficient per query | Multiple queries with shared candidates
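
To stay under a requests-per-minute limit such as the 60 per minute quoted above, calls can be spaced out on the client side. The following is a minimal sketch of that idea; MinIntervalLimiter is a hypothetical class, and the clock and sleep parameters exist only so the behaviour can be tested without real waiting.

```python
import time

class MinIntervalLimiter:
    """Space calls at least 60 / max_per_minute seconds apart."""

    def __init__(self, max_per_minute=60, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

In a loop over queries, calling limiter.wait() immediately before each rerank request would keep the request rate at or below the configured limit.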

Quick tip

Always provide a clear, concise query and well-defined candidate texts to get the most accurate reranking results from Cohere.

Common mistake

Beginners often forget to pass the candidate texts as a list of strings under the 'documents' parameter, causing API errors.
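
A small pre-flight check can catch this mistake before the request is sent. The sketch below assumes plain-string candidates, as used throughout this article; validate_documents is a hypothetical helper name.

```python
def validate_documents(documents):
    """Raise a descriptive error unless documents is a non-empty list of strings."""
    if isinstance(documents, str):
        raise TypeError("documents must be a list of strings, not a single string")
    if not isinstance(documents, list) or not documents:
        raise ValueError("documents must be a non-empty list of strings")
    for position, doc in enumerate(documents):
        if not isinstance(doc, str):
            raise TypeError(
                f"documents[{position}] is {type(doc).__name__}, expected str"
            )
    return documents
```

Calling validate_documents(candidates) right before the rerank call turns a confusing API error into a local, readable one.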

Verified 2026-04 · rerank-english-v2.0, rerank-multilingual-v1.0