Fix poor vector search recall
Quick answer
To fix poor vector search recall, improve your embedding quality by using stronger models like
text-embedding-3-small, tune your vector index search parameters such as `nprobe` or `efSearch`, and consider hybrid search combining vector and keyword matching. Also, increase the number of retrieved neighbors (`top_k`) and normalize vectors for consistent similarity scoring.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install openai>=1.0
- pip install faiss-cpu
Setup
Install required packages and set your environment variable for the OpenAI API key.
- Install OpenAI SDK and FAISS for vector search indexing.
- Set `OPENAI_API_KEY` in your environment.
pip install openai faiss-cpu
output
Collecting openai
Collecting faiss-cpu
Successfully installed openai-1.x faiss-cpu-1.x
Step by step
This example demonstrates embedding documents, building a FAISS index with tuned parameters, and performing a vector search with improved recall.
import os
import numpy as np
from openai import OpenAI
import faiss
# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Sample documents
documents = [
"The quick brown fox jumps over the lazy dog.",
"Artificial intelligence and machine learning are transforming industries.",
"OpenAI provides powerful AI models for developers.",
"Vector search recall depends on embedding quality and index tuning.",
"FAISS is a popular library for efficient similarity search."
]
# Get embeddings for documents
response = client.embeddings.create(
model="text-embedding-3-small",
input=documents
)
embeddings = np.array([data.embedding for data in response.data], dtype=np.float32)
# Normalize embeddings for cosine similarity
faiss.normalize_L2(embeddings)
# Build a FAISS IVF index; with normalized vectors, inner product equals cosine similarity
# IVF100,PQ8 suits large corpora, but IVF training needs at least as many vectors as
# clusters, so this tiny demo uses nlist=2 with exact (Flat) storage per cluster
dimension = embeddings.shape[1]
index = faiss.index_factory(dimension, "IVF2,Flat", faiss.METRIC_INNER_PRODUCT)
index.train(embeddings)
index.add(embeddings)
# Set search parameters to improve recall
index.nprobe = 2  # Number of clusters searched; raise this (e.g. 20+) on larger indexes
# Query embedding
query = "How to improve vector search recall?"
query_response = client.embeddings.create(model="text-embedding-3-small", input=[query])
query_embedding = np.array(query_response.data[0].embedding, dtype=np.float32).reshape(1, -1)
faiss.normalize_L2(query_embedding)
# Perform search
k = 3  # Retrieve top 3 results
D, I = index.search(query_embedding, k)
print("Top results:")
for rank, idx in enumerate(I[0]):
    print(f"{rank + 1}. {documents[idx]} (score: {D[0][rank]:.4f})")  # D is cosine similarity here
output
Top results:
1. Vector search recall depends on embedding quality and index tuning. (score: 0.8721)
2. FAISS is a popular library for efficient similarity search. (score: 0.8437)
3. Artificial intelligence and machine learning are transforming industries. (score: 0.7325)
Common variations
You can improve recall further by:
- Using asynchronous calls with `asyncio` for batch embedding requests.
- Trying different embedding models like `text-embedding-3-large` for higher-quality vectors.
- Adjusting `nprobe` or `efSearch` depending on your index type (e.g., `efSearch` for HNSW).
- Combining vector search with keyword filters (hybrid search) for better precision and recall.
import asyncio
import os
from openai import AsyncOpenAI

async def async_embed(client, texts):
    # In openai>=1.0, async calls use AsyncOpenAI with the same method names
    response = await client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [data.embedding for data in response.data]

async def main():
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    embeddings = await async_embed(client, ["Example async embedding"])
    print(embeddings[0][:5])  # Print first 5 dimensions

asyncio.run(main())
output
[0.0123, -0.0456, 0.0789, 0.0345, -0.0234]
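The hybrid-search variation can be as simple as blending a vector similarity score with a keyword score per document. The sketch below is a toy illustration: the keyword side is plain token overlap standing in for a real full-text scorer like BM25, the vector scores are hard-coded stand-ins for FAISS results, and the `alpha` weight is an assumption to tune per corpus.

```python
import numpy as np

documents = [
    "Vector search recall depends on embedding quality and index tuning.",
    "FAISS is a popular library for efficient similarity search.",
    "The quick brown fox jumps over the lazy dog.",
]

def keyword_score(query, doc):
    # Token overlap as a stand-in for BM25 / full-text search
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

# Pretend these came from a vector search (cosine similarities per doc)
vector_scores = np.array([0.87, 0.84, 0.12])

query = "improve vector search recall"
alpha = 0.7  # weight on the vector score; tune per corpus
hybrid = [
    alpha * v + (1 - alpha) * keyword_score(query, doc)
    for v, doc in zip(vector_scores, documents)
]
for doc, s in sorted(zip(documents, hybrid), key=lambda x: -x[1]):
    print(f"{s:.3f}  {doc}")
```

In production you would typically take the top candidates from each retriever and merge them (e.g. with reciprocal rank fusion) rather than scoring every document.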
Troubleshooting
If recall remains poor:
- Verify embeddings are normalized before indexing and querying.
- Increase `nprobe` or `efSearch` to search more clusters or neighbors, at the cost of latency.
- Check for data quality issues or inconsistent preprocessing.
- Ensure your embedding model matches the domain of your documents.
- Use hybrid search combining vector similarity with keyword filters if relevant.
Key takeaways
- Use high-quality embeddings like `text-embedding-3-small` and normalize vectors for cosine similarity.
- Tune vector index parameters such as `nprobe` or `efSearch` to balance recall and latency.
- Hybrid search combining vector and keyword matching improves recall in many scenarios.