Fix poor vector search recall
Quick answer
To fix poor vector search recall, improve your embedding quality by using stronger models like
text-embedding-3-small, tune your vector index search parameters such as `nprobe` or `efSearch`, and consider hybrid search combining vector and keyword matching. Also, increase the number of retrieved neighbors (`top_k`) and normalize vectors for consistent similarity scoring.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install openai>=1.0
- pip install faiss-cpu
Setup
Install required packages and set your environment variable for the OpenAI API key.
- Install OpenAI SDK and FAISS for vector search indexing.
- Set `OPENAI_API_KEY` in your environment.
pip install openai faiss-cpu
output
Collecting openai
Collecting faiss-cpu
Successfully installed openai-1.x faiss-cpu-1.x
Step by step
This example demonstrates embedding documents, building a FAISS index with tuned parameters, and performing a vector search with improved recall.
import os
import numpy as np
from openai import OpenAI
import faiss
# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Sample documents
documents = [
"The quick brown fox jumps over the lazy dog.",
"Artificial intelligence and machine learning are transforming industries.",
"OpenAI provides powerful AI models for developers.",
"Vector search recall depends on embedding quality and index tuning.",
"FAISS is a popular library for efficient similarity search."
]
# Get embeddings for documents
response = client.embeddings.create(
model="text-embedding-3-small",
input=documents
)
embeddings = np.array([data.embedding for data in response.data], dtype=np.float32)
# Normalize embeddings for cosine similarity
faiss.normalize_L2(embeddings)
# Build a FAISS IVF index; with normalized vectors, inner product equals cosine similarity
# IVF100,PQ8 suits large corpora, but IVF training needs at least as many vectors as
# clusters, so this tiny demo uses nlist=2 with exact (Flat) storage per cluster
dimension = embeddings.shape[1]
index = faiss.index_factory(dimension, "IVF2,Flat", faiss.METRIC_INNER_PRODUCT)
index.train(embeddings)
index.add(embeddings)
# Set search parameters to improve recall
index.nprobe = 2  # Number of clusters searched; raise this (e.g. 20+) on larger indexes
# Query embedding
query = "How to improve vector search recall?"
query_response = client.embeddings.create(model="text-embedding-3-small", input=[query])
query_embedding = np.array(query_response.data[0].embedding, dtype=np.float32).reshape(1, -1)
faiss.normalize_L2(query_embedding)
# Perform search
k = 3  # Retrieve top 3 results
D, I = index.search(query_embedding, k)
print("Top results:")
for rank, idx in enumerate(I[0]):
    print(f"{rank + 1}. {documents[idx]} (score: {D[0][rank]:.4f})")  # D is cosine similarity here
output
Top results:
1. Vector search recall depends on embedding quality and index tuning. (score: 0.8721)
2. FAISS is a popular library for efficient similarity search. (score: 0.8437)
3. Artificial intelligence and machine learning are transforming industries. (score: 0.7325)
Common variations
You can improve recall further by:
- Using asynchronous calls with `asyncio` for batch embedding requests.
- Trying different embedding models like `text-embedding-3-large` for higher-quality vectors.
- Adjusting `nprobe` or `efSearch` depending on your index type (e.g., `efSearch` for HNSW).
- Combining vector search with keyword filters (hybrid search) for better precision and recall.
import asyncio
import os
from openai import AsyncOpenAI

async def async_embed(client, texts):
    # In openai>=1.0, async calls use AsyncOpenAI with the same method names
    response = await client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [data.embedding for data in response.data]

async def main():
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    embeddings = await async_embed(client, ["Example async embedding"])
    print(embeddings[0][:5])  # Print first 5 dimensions

asyncio.run(main())
output
[0.0123, -0.0456, 0.0789, 0.0345, -0.0234]
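The hybrid-search variation can be as simple as blending a vector similarity score with a keyword score per document. The sketch below is a toy illustration: the keyword side is plain token overlap standing in for a real full-text scorer like BM25, the vector scores are hard-coded stand-ins for FAISS results, and the `alpha` weight is an assumption to tune per corpus.

```python
import numpy as np

documents = [
    "Vector search recall depends on embedding quality and index tuning.",
    "FAISS is a popular library for efficient similarity search.",
    "The quick brown fox jumps over the lazy dog.",
]

def keyword_score(query, doc):
    # Token overlap as a stand-in for BM25 / full-text search
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

# Pretend these came from a vector search (cosine similarities per doc)
vector_scores = np.array([0.87, 0.84, 0.12])

query = "improve vector search recall"
alpha = 0.7  # weight on the vector score; tune per corpus
hybrid = [
    alpha * v + (1 - alpha) * keyword_score(query, doc)
    for v, doc in zip(vector_scores, documents)
]
for doc, s in sorted(zip(documents, hybrid), key=lambda x: -x[1]):
    print(f"{s:.3f}  {doc}")
```

In production you would typically take the top candidates from each retriever and merge them (e.g. with reciprocal rank fusion) rather than scoring every document.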
Troubleshooting
If recall remains poor:
- Verify embeddings are normalized before indexing and querying.
- Increase `nprobe` or `efSearch` to search more clusters or neighbors, at the cost of latency.
- Check for data quality issues or inconsistent preprocessing.
- Ensure your embedding model matches the domain of your documents.
- Use hybrid search combining vector similarity with keyword filters if relevant.
Key takeaways
- Use high-quality embeddings like `text-embedding-3-small` and normalize vectors for cosine similarity.
- Tune vector index parameters such as `nprobe` or `efSearch` to balance recall and latency.
- Hybrid search combining vector and keyword matching improves recall in many scenarios.