
Hybrid search with embeddings explained

Quick answer
Hybrid search combines embedding-based similarity search with traditional keyword-based search so that each covers the other's weaknesses. A common pattern first retrieves candidates with keyword filters, then reranks or expands them using embedding similarity, which improves the relevance of the retrieved results.

Prerequisites

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0" (quoted so the shell does not treat > as a redirect)
  • pip install faiss-cpu

Setup

Install necessary packages for embeddings and vector search. Set your OpenAI API key as an environment variable.

bash
pip install openai faiss-cpu
export OPENAI_API_KEY="your-api-key"  # placeholder: replace with your own key

Step by step

This example performs a simple hybrid search: it filters documents by a category tag (standing in for the keyword stage), then ranks the remaining documents by embedding similarity, using OpenAI embeddings and FAISS for vector search.

python
import os
from openai import OpenAI
import faiss
import numpy as np

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample documents with text and metadata
documents = [
    {"id": 1, "text": "Apple releases new iPhone models.", "category": "tech"},
    {"id": 2, "text": "Local elections coming up next month.", "category": "politics"},
    {"id": 3, "text": "New study shows health benefits of apples.", "category": "health"},
    {"id": 4, "text": "Tech giants invest in AI research.", "category": "tech"}
]

# Step 1: Filter documents by metadata (category == 'tech'), standing in for a keyword filter
keyword_filtered_docs = [doc for doc in documents if doc["category"] == "tech"]

# Step 2: Get embeddings for filtered docs
texts = [doc["text"] for doc in keyword_filtered_docs]
response = client.embeddings.create(model="text-embedding-3-small", input=texts)
embeddings = np.array([data.embedding for data in response.data]).astype('float32')

# Step 3: Build FAISS index
dimension = embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(embeddings)

# Step 4: Query embedding
query = "latest AI technology"
query_response = client.embeddings.create(model="text-embedding-3-small", input=[query])
query_embedding = np.array(query_response.data[0].embedding).astype('float32').reshape(1, -1)

# Step 5: Search top 2 similar docs within filtered set
k = 2
distances, indices = index.search(query_embedding, k)

# Step 6: Retrieve results (FAISS pads indices with -1 when the index holds fewer than k vectors)
results = [keyword_filtered_docs[i] for i in indices[0] if i != -1]

print("Hybrid search results:")
for res in results:
    print(f"- {res['text']} (Category: {res['category']})")
output
Hybrid search results:
- Tech giants invest in AI research. (Category: tech)
- Apple releases new iPhone models. (Category: tech)
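
To try the same filter-then-rank flow without an API key, the following is a minimal offline sketch. It assumes hand-made 3-dimensional vectors in place of real embeddings and a naive substring match as the keyword stage; `filter_then_rank` and the `vec` field are hypothetical names used only for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy corpus: the 3-d vectors are hand-made stand-ins for real embeddings (assumption).
docs = [
    {"text": "Apple releases new iPhone models.", "vec": [0.9, 0.1, 0.0]},
    {"text": "Local elections coming up next month.", "vec": [0.0, 0.1, 0.9]},
    {"text": "New study shows health benefits of apples.", "vec": [0.1, 0.9, 0.1]},
    {"text": "Tech giants invest in AI research.", "vec": [0.8, 0.2, 0.1]},
]

def filter_then_rank(keyword, query_vec, docs, k=2):
    """Stage 1: keep docs whose text contains the keyword (case-insensitive substring).
    Stage 2: rank the survivors by cosine similarity to the query vector."""
    candidates = [d for d in docs if keyword.lower() in d["text"].lower()]
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return ranked[:k]

# The query vector leans toward the "health" direction of the toy space.
top = filter_then_rank("apple", [0.2, 0.9, 0.0], docs)
for d in top:
    print(d["text"])
```

Both "Apple" documents pass the keyword stage, and the similarity stage then decides their order, which is the part a keyword filter alone cannot do.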

Common variations

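You can implement hybrid search with a different embedding model, such as text-embedding-3-large. Chat models like gpt-4o-mini or claude-3-5-sonnet-20241022 do not produce embeddings, though they can rerank or summarize the retrieved results. Async calls can improve throughput when embedding many inputs. Instead of filtering by metadata, you can also combine keyword scores and embedding similarity scores into a single ranking.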

python
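import asyncio
import os
from openai import AsyncOpenAI

async def async_hybrid_search():
    # AsyncOpenAI exposes the same methods as OpenAI, but they must be awaited
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    # Example queries; the second one is illustrative only
    queries = ["latest AI technology", "new smartphone releases"]
    # Embed the queries concurrently rather than one at a time
    responses = await asyncio.gather(*[
        client.embeddings.create(model="text-embedding-3-small", input=[q])
        for q in queries
    ])
    # Filtering and FAISS search then proceed as in the synchronous example
    return [r.data[0].embedding for r in responses]

# Run async example
# asyncio.run(async_hybrid_search())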
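
The score-combination variation mentioned above can be sketched as a weighted sum. This is one possible scheme under stated assumptions, not a standard formula: the keyword score is a naive term-overlap fraction, the vectors are hand-made stand-ins for embeddings, and `alpha`, `keyword_score`, and `hybrid_rank` are hypothetical names chosen for this example.

```python
import math
import re

def keyword_score(query, text):
    """Fraction of query terms that appear in the text (a naive lexical score)."""
    q_terms = set(re.findall(r"\w+", query.lower()))
    t_terms = set(re.findall(r"\w+", text.lower()))
    return len(q_terms & t_terms) / len(q_terms) if q_terms else 0.0

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hybrid_rank(query, query_vec, docs, alpha=0.5):
    """Score each doc as alpha * keyword + (1 - alpha) * cosine, highest first."""
    scored = [
        (alpha * keyword_score(query, d["text"])
         + (1 - alpha) * cosine(query_vec, d["vec"]), d["text"])
        for d in docs
    ]
    return sorted(scored, reverse=True)

# Toy vectors stand in for real embeddings (assumption).
docs = [
    {"text": "Apple releases new iPhone models.", "vec": [0.9, 0.1, 0.0]},
    {"text": "Local elections coming up next month.", "vec": [0.0, 0.1, 0.9]},
    {"text": "Tech giants invest in AI research.", "vec": [0.8, 0.2, 0.1]},
]

ranked = hybrid_rank("AI research", [0.8, 0.2, 0.1], docs)
for score, text in ranked:
    print(f"{score:.3f}  {text}")
```

Unlike a hard filter, this keeps documents that miss the keywords but are semantically close, at a lower score. The two scores live on different scales, so `alpha` usually needs tuning on real queries.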

Troubleshooting

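  • If embedding similarity returns irrelevant results, check that your keyword filter narrows the candidate set as intended; a filter that is too loose leaves off-topic documents for the embedding stage to rank.
  • If the FAISS index throws dimension errors, verify that the query and document embeddings come from the same model and match the dimension the index was built with.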
  • For large datasets, consider approximate nearest neighbor indexes like IndexIVFFlat for performance.
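
For the dimension-error case above, a small pre-check before adding vectors to the index can locate the offending entries. A common cause is mixing models, since text-embedding-3-small returns 1536-dimensional vectors by default and text-embedding-3-large returns 3072. `find_dimension_mismatches` is a hypothetical helper written for this sketch, not part of FAISS or the OpenAI SDK.

```python
def find_dimension_mismatches(vectors, expected_dim):
    """Return (position, actual_length) for every vector whose length differs."""
    return [(i, len(v)) for i, v in enumerate(vectors) if len(v) != expected_dim]

# Toy 3-d vectors; the second one is deliberately too short.
vectors = [[0.1, 0.2, 0.3], [0.4, 0.5], [0.6, 0.7, 0.8]]
bad = find_dimension_mismatches(vectors, expected_dim=3)

if bad:
    print(f"Vectors with unexpected dimension: {bad}")
```

Running the check on both the document embeddings and the query embedding, with `expected_dim` set to the index dimension, narrows the error to a specific input.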

Key takeaways

  • Hybrid search combines keyword filtering with embedding similarity for precise retrieval.
  • Use embeddings to rerank or expand keyword-filtered results for better relevance.
  • FAISS enables efficient vector similarity search on filtered document subsets.
Verified 2026-04 · text-embedding-3-small, gpt-4o-mini, claude-3-5-sonnet-20241022