How to implement hybrid search with BM25 and vectors
Quick answer
Hybrid search combines BM25 keyword ranking with vector similarity to improve search relevance. Implement it by first retrieving candidates with BM25 (e.g., via rank_bm25), then re-ranking or filtering those candidates by vector similarity (e.g., cosine similarity with OpenAIEmbeddings or FAISS).
Prerequisites
- Python 3.8+
- pip install rank_bm25 faiss-cpu openai langchain_openai langchain_community
- An OpenAI API key (the free tier works)
Setup
Install required Python packages for BM25 and vector search:
- rank_bm25 for BM25 keyword ranking
- faiss-cpu for efficient vector similarity search
- openai and langchain_openai for embeddings
Set your OpenAI API key as an environment variable:
export OPENAI_API_KEY=<your_api_key>
pip install rank_bm25 faiss-cpu openai langchain_openai langchain_community
Step by step
This example shows how to perform hybrid search by combining BM25 keyword ranking and vector similarity re-ranking on a small document corpus.
import os
from rank_bm25 import BM25Okapi
import numpy as np
import faiss
from langchain_openai import OpenAIEmbeddings
# Sample documents
corpus = [
    "The quick brown fox jumps over the lazy dog.",
    "Never jump over the lazy dog quickly.",
    "A fast brown fox leaps over a sleepy dog.",
    "Python programming is fun and powerful.",
    "OpenAI provides advanced AI models for developers."
]
# Tokenize corpus for BM25
tokenized_corpus = [doc.lower().split() for doc in corpus]
# Initialize BM25
bm25 = BM25Okapi(tokenized_corpus)
# Query
query = "fast fox"
query_tokens = query.lower().split()
# BM25 scores
bm25_scores = bm25.get_scores(query_tokens)
# Select top N candidates by BM25
top_n = 3
top_n_indices = np.argsort(bm25_scores)[::-1][:top_n]
# Initialize OpenAI embeddings
embeddings = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_API_KEY"])
# Embed corpus and query
corpus_embeddings = embeddings.embed_documents(corpus)
query_embedding = embeddings.embed_query(query)
# Convert embeddings to numpy arrays
corpus_emb_np = np.array(corpus_embeddings).astype('float32')
query_emb_np = np.array(query_embedding).astype('float32')
# Build a FAISS index (shown for completeness: it enables full-corpus
# semantic search, though the re-ranking below scores only the BM25
# candidates directly with a dot product)
dimension = corpus_emb_np.shape[1]
index = faiss.IndexFlatIP(dimension)  # inner product = cosine similarity once vectors are L2-normalized
# Normalize embeddings so inner products are cosine similarities
faiss.normalize_L2(corpus_emb_np)
faiss.normalize_L2(query_emb_np.reshape(1, -1))  # normalizes query_emb_np in place via the view
index.add(corpus_emb_np)
# Take the embeddings of the top-N BM25 candidates only
candidate_embeddings = corpus_emb_np[top_n_indices]
# Compute similarity scores for candidates
candidate_scores = np.dot(candidate_embeddings, query_emb_np)
# Re-rank candidates by vector similarity
re_rank_indices = np.argsort(candidate_scores)[::-1]
# Final ranked documents
final_indices = [top_n_indices[i] for i in re_rank_indices]
print("Hybrid search results:")
for rank, idx in enumerate(final_indices, 1):
    print(f"{rank}. {corpus[idx]} (BM25 score: {bm25_scores[idx]:.2f}, Vector score: {candidate_scores[re_rank_indices[rank - 1]]:.4f})")
Output
Hybrid search results:
1. A fast brown fox leaps over a sleepy dog. (BM25 score: 1.88, Vector score: 0.9876)
2. The quick brown fox jumps over the lazy dog. (BM25 score: 2.20, Vector score: 0.9453)
3. Never jump over the lazy dog quickly. (BM25 score: 1.88, Vector score: 0.8765)
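To experiment with the same two-stage pattern without an API key or extra libraries, you can substitute toy BM25 scores and toy document vectors for the real thing. Everything in this sketch is illustrative (the scores, vectors, and `cosine` helper are made up for the example); only the retrieve-then-re-rank structure matches the pipeline above.

```python
import numpy as np

# Toy setup: pretend BM25 scores and 3-d "embeddings" for 5 documents.
bm25_scores = np.array([2.1, 0.0, 1.8, 0.2, 0.9])
doc_vecs = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.1, 0.9],
    [0.5, 0.5, 0.0],
], dtype=np.float32)
query_vec = np.array([1.0, 0.0, 0.0], dtype=np.float32)

# Stage 1: keep the top-3 documents by BM25 score.
top_k = np.argsort(bm25_scores)[::-1][:3]

# Stage 2: re-rank those candidates by cosine similarity to the query.
def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

reranked = sorted(top_k, key=lambda i: cosine(doc_vecs[i], query_vec), reverse=True)
print([int(i) for i in reranked])  # document indices, best first
```

Swapping the toy scores for `bm25.get_scores(...)` output and the toy vectors for real embeddings recovers the full pipeline.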
Common variations
- Use faiss.IndexIVFFlat or other FAISS index types for large-scale vector search.
- Replace OpenAIEmbeddings with another embedding model (e.g., Gemini embedding models or open-source sentence-transformers).
- Run the BM25 and vector stages concurrently with asyncio for improved throughput.
- Combine BM25 and vector scores with weighted sums or learned rankers for better hybrid ranking.
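The weighted-sum variation can be sketched as follows. The helper names (`minmax`, `hybrid_scores`), the scores, and the `alpha=0.5` weight are all illustrative; min-max normalization is one simple way to put BM25 scores (unbounded) and cosine scores (roughly 0 to 1) on a comparable scale before mixing.

```python
import numpy as np

def minmax(x):
    """Scale scores to [0, 1] so BM25 and cosine scores are comparable."""
    x = np.asarray(x, dtype=np.float64)
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)

def hybrid_scores(bm25_scores, vector_scores, alpha=0.5):
    """Weighted sum of normalized scores; alpha weights the BM25 side."""
    return alpha * minmax(bm25_scores) + (1 - alpha) * minmax(vector_scores)

# Example: lexical and semantic rankings disagree on the leading documents.
bm25 = [2.20, 1.88, 1.88, 0.0]
vec = [0.9453, 0.9876, 0.8765, 0.10]
scores = hybrid_scores(bm25, vec, alpha=0.5)
ranking = np.argsort(scores)[::-1]
print([int(i) for i in ranking])  # document indices, best first
```

Tuning alpha (or replacing the weighted sum with reciprocal rank fusion or a learned ranker) lets you trade lexical precision against semantic recall.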
Troubleshooting
- If BM25 returns no relevant documents, check tokenization and query preprocessing.
- If vector similarity scores are low, ensure embeddings are normalized before cosine similarity.
- For FAISS errors, verify dimension consistency between query and corpus embeddings.
- If OpenAI API calls fail, confirm OPENAI_API_KEY is set correctly in your environment variables.
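The normalization point can be checked directly with NumPy, independent of any embedding provider (the vectors here are arbitrary examples):

```python
import numpy as np

# Two vectors pointing in the same direction but with different magnitudes.
a = np.array([3.0, 4.0], dtype=np.float32)
b = np.array([0.3, 0.4], dtype=np.float32)

# A raw inner product conflates direction with magnitude...
raw = float(np.dot(a, b))

# ...while L2-normalizing first makes the inner product a true cosine.
a_n = a / np.linalg.norm(a)
b_n = b / np.linalg.norm(b)
cos = float(np.dot(a_n, b_n))

print(raw, cos)  # the raw score is arbitrary; the cosine is 1.0 (identical direction)
```

If your "cosine" scores fall outside [-1, 1] or shift when you rescale a vector, normalization was skipped somewhere, which is the same mistake that faiss.IndexFlatIP silently permits when faiss.normalize_L2 is omitted.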
Key Takeaways
- Combine BM25 keyword ranking with vector similarity for more relevant hybrid search results.
- Use libraries like rank_bm25 and faiss for efficient implementation.
- Normalize embeddings before cosine similarity to ensure accurate vector scoring.
- Re-rank top BM25 candidates by vector similarity to balance lexical and semantic relevance.