
How to search with FAISS

Quick answer
Use FAISS to build a vector index and run similarity search: add your vectors to an index, then query it with one or more query vectors. The IndexFlatL2 index performs exact (brute-force) nearest neighbor search using L2 distance and returns the top-k closest vectors for each query.

PREREQUISITES

  • Python 3.8+
  • pip install faiss-cpu
  • Basic knowledge of vector embeddings

Setup

Install the faiss-cpu package with pip to get the CPU build of FAISS. Once installed, the library is imported in Python as faiss.

bash
pip install faiss-cpu

Step by step

This example shows how to create a FAISS index, add vectors, and perform a similarity search to find the nearest neighbors.

python
import numpy as np
import faiss

# Create some sample data: 5 vectors of dimension 4
vectors = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0]
], dtype='float32')

# Initialize a FAISS index for L2 distance (exact search)
index = faiss.IndexFlatL2(4)  # 4 is the vector dimension

# Add vectors to the index
index.add(vectors)

# Query vector
query = np.array([[1.0, 0.0, 0.0, 0.0]], dtype='float32')

# Search for the 3 nearest neighbors
k = 3
D, I = index.search(query, k)  # D = squared L2 distances, I = indices

print("Indices of nearest neighbors:", I)
print("Distances to nearest neighbors:", D)
output
Indices of nearest neighbors: [[0 4 1]]
Distances to nearest neighbors: [[0. 1. 2.]]
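
To see where these numbers come from, note that IndexFlatL2 reports squared Euclidean distances. The sketch below recomputes them in plain Python for the same sample data, without FAISS; the helper name squared_l2 is made up for this illustration.

```python
# Same sample data as in the FAISS example, as plain Python lists
vectors = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
]
query = [1.0, 0.0, 0.0, 0.0]

def squared_l2(a, b):
    """Hypothetical helper: squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

distances = [squared_l2(query, v) for v in vectors]
print(distances)  # [0.0, 2.0, 2.0, 2.0, 1.0]

# Rank indices by distance; sorted() is stable, so ties keep index order
ranked = sorted(range(len(vectors)), key=lambda i: distances[i])
print(ranked[:3])  # [0, 4, 1]
```

Vectors 1, 2, and 3 are all at squared distance 2 from the query, so which of them appears in third place is a tie-break and should not be relied on.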

Common variations

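For large datasets, you can use an approximate index type such as IndexIVFFlat, which must be trained before vectors are added. GPU support is available with faiss-gpu. For cosine similarity, L2-normalize both the indexed vectors and the query, then search with an inner-product index (IndexFlatIP).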

python
import numpy as np
import faiss

# Example: Normalize vectors for cosine similarity
vectors = np.array([[1, 2, 3], [4, 5, 6]], dtype='float32')
faiss.normalize_L2(vectors)

# Create index for inner product (cosine similarity after normalization)
index = faiss.IndexFlatIP(3)
index.add(vectors)

query = np.array([[1, 0, 0]], dtype='float32')
faiss.normalize_L2(query)

D, I = index.search(query, 1)
print("Nearest neighbor index:", I)
print("Similarity score:", D)
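output
Nearest neighbor index: [[1]]
Similarity score: [[0.4558423]]

After normalization, the query [1, 0, 0] scores 4/√77 ≈ 0.456 against vector 1 and 1/√14 ≈ 0.267 against vector 0, so vector 1 is the nearest neighbor. The last printed digits may vary slightly.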

Troubleshooting

  • If you get a dimension mismatch error, verify that your vectors and query have the same dimension as the one the index was created with.
  • For large datasets, use an approximate index like IndexIVFFlat and train it before adding vectors.
  • Ensure vectors are 2-D float32 NumPy arrays; other dtypes (such as NumPy's default float64) can cause errors.

Key Takeaways

  • Use faiss.IndexFlatL2 for exact nearest neighbor search with L2 distance.
  • Normalize vectors to use cosine similarity with IndexFlatIP.
  • For large datasets, use approximate indexes like IndexIVFFlat with training.
  • Always ensure vectors are numpy arrays of dtype float32.
  • FAISS supports GPU acceleration via faiss-gpu for faster search.
Verified 2026-04