Code beginner · 3 min read

How to use FAISS in Python

Direct answer
Use the faiss Python library to create an index, add vector embeddings, and run fast similarity searches. The simplest index type is IndexFlatL2, which performs exact nearest-neighbor search using L2 distance.

Setup

Install
bash
pip install faiss-cpu numpy
Imports
python
import faiss
import numpy as np

Examples

in: Add 100 random 128-dim vectors and search for the 5 nearest neighbors of a query vector.
out: The indices and distances of the 5 nearest vectors are printed.
in: Create a FAISS index from 512-dim text embeddings and query the top 3 matches.
out: The indices of the 3 closest vectors and their distances are returned.
in: Search an empty FAISS index.
out: With IndexFlatL2 no error is raised; every returned index is -1 and the distances are placeholder values, so check for -1 before using the results.

Integration steps

  1. Install FAISS and numpy packages.
  2. Import faiss and numpy in your Python script.
  3. Create a FAISS index (e.g., IndexFlatL2 for exact search).
  4. Add your vector embeddings to the index using add().
  5. Query the index with a vector using search() to get nearest neighbors.
  6. Process the returned distances and indices for your application.
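
Step 6 is the one the rest of this page does not show. As a rough sketch, the indices returned by search() are positions in insertion order, so they can be mapped back to whatever payload was stored alongside the vectors. The `documents` list, the `to_hits` helper, and the stand-in arrays are all hypothetical; real values come from `index.search(query, k)`.

```python
import numpy as np

# Hypothetical payloads, kept in the same order the vectors were added.
documents = ["doc about cats", "doc about dogs", "doc about birds", "doc about fish"]

# Stand-in values shaped like the (distances, indices) pair that search()
# returns for one query with k = 3.
distances = np.array([[0.8, 1.5, 2.1]], dtype="float32")
indices = np.array([[2, 0, -1]], dtype="int64")  # -1 marks "no neighbor found"


def to_hits(distances, indices, documents):
    """Turn the first query's search output into (document, distance) pairs."""
    hits = []
    for dist, idx in zip(distances[0], indices[0]):
        if idx < 0:  # skip padding entries
            continue
        hits.append((documents[idx], float(dist)))
    return hits


hits = to_hits(distances, indices, documents)
print(hits)
```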

Full code

python
import faiss
import numpy as np

# Dimension of vectors
vector_dim = 128

# Create a FAISS index for L2 distance (exact search)
index = faiss.IndexFlatL2(vector_dim)

# Generate 100 random vectors (float32)
vectors = np.random.random((100, vector_dim)).astype('float32')

# Add vectors to the index
index.add(vectors)

# Create a random query vector
query_vector = np.random.random((1, vector_dim)).astype('float32')

# Search for the 5 nearest neighbors
k = 5
distances, indices = index.search(query_vector, k)

print("Nearest neighbor indices:", indices)
print("Distances:", distances)
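output (illustrative format only; the vectors are random, so the values differ on every run)
Nearest neighbor indices: [[42 17 88  3 59]]
Distances: [[0.1234 0.2345 0.3456 0.4567 0.5678]]
Note: IndexFlatL2 reports squared L2 distances. For uniform random 128-dim vectors, expect real values on the order of 15 to 20 rather than below 1.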
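
To make clear what the exact search computes, here is a brute-force sketch in plain NumPy. It is not how FAISS is implemented internally. It only reproduces the same quantity (squared L2 distance to every stored vector, smallest first) and can be used to sanity-check results on small datasets. The seed and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.random((100, 128), dtype=np.float32)
query = rng.random((1, 128), dtype=np.float32)
k = 5

# Squared Euclidean distance from the query to every stored vector.
squared_l2 = ((vectors - query) ** 2).sum(axis=1)

# Positions of the k smallest distances, nearest first.
nearest = np.argsort(squared_l2)[:k]

print("Nearest neighbor indices:", nearest)
print("Squared L2 distances:", squared_l2[nearest])
```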

API trace

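FAISS runs in-process, so there is no network request. The JSON below is only a schematic of the inputs and outputs of the calls shown above.

Request
json
{"index": "IndexFlatL2", "add": {"vectors": [[float32]]}, "search": {"query_vector": [[float32]], "k": 5}}
Response
json
{"distances": [[float]], "indices": [[int]]}
Extract: Unpack the tuple returned by index.search(query_vector, k) to get the distances and indices. Each is a 2-D array with one row per query and k columns.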

Variants

Streaming search with incremental vector addition

Use when you receive vectors in batches and want to update the index dynamically.

python
import faiss
import numpy as np

vector_dim = 128
index = faiss.IndexFlatL2(vector_dim)

# Add vectors incrementally
for _ in range(10):
    batch = np.random.random((10, vector_dim)).astype('float32')
    index.add(batch)

query_vector = np.random.random((1, vector_dim)).astype('float32')
k = 3
distances, indices = index.search(query_vector, k)
print(indices, distances)

Using an IVF index for large-scale approximate search

Use an IVF (inverted file) index for faster approximate search on large datasets. Unlike IndexFlatL2, it must be trained before vectors are added.

python
import faiss
import numpy as np

vector_dim = 128
nlist = 100  # number of clusters
quantizer = faiss.IndexFlatL2(vector_dim)
index = faiss.IndexIVFFlat(quantizer, vector_dim, nlist, faiss.METRIC_L2)

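# Train the index on sample vectors. FAISS warns when there are fewer than
# 39 training points per cluster, so use at least 39 * nlist = 3900 here.
train_vectors = np.random.random((4000, vector_dim)).astype('float32')
index.train(train_vectors)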

# Add vectors
index.add(train_vectors)

query_vector = np.random.random((1, vector_dim)).astype('float32')
k = 5
index.nprobe = 10  # number of clusters to search

distances, indices = index.search(query_vector, k)
print(indices, distances)
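
Async FAISS search with Python threading

Run searches in worker threads so the main thread is not blocked while they execute. The example below joins the thread immediately for simplicity; in a real application the main thread would do other work before calling join().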

python
import faiss
import numpy as np
import threading

vector_dim = 128
index = faiss.IndexFlatL2(vector_dim)
vectors = np.random.random((100, vector_dim)).astype('float32')
index.add(vectors)

results = {}
def search_async(query, k, key):
    distances, indices = index.search(query, k)
    results[key] = (indices, distances)

query_vector = np.random.random((1, vector_dim)).astype('float32')
k = 5
thread = threading.Thread(target=search_async, args=(query_vector, k, 'q1'))
thread.start()
thread.join()
print(results['q1'])

Performance

Latency: roughly 10-50 ms per search over 100k vectors on CPU with IndexFlatL2. Actual numbers depend on hardware, vector dimension, and batch size.
Cost: FAISS is open source and free; the only cost is your own compute.
Rate limits: None, since FAISS runs locally.
  • Keep vector dimension as low as possible without losing semantic meaning.
  • Use approximate indexes like IVF for large datasets to reduce latency.
  • Batch queries when possible to amortize overhead.

Approach | Latency | Cost/call | Best for
IndexFlatL2 (exact) | ~10-50 ms | Free (local) | Small to medium datasets, exact search
IndexIVFFlat (approximate) | ~1-10 ms | Free (local) | Large datasets, faster approximate search
Streaming incremental add | ~10-50 ms | Free (local) | Dynamic datasets with frequent updates

Quick tip

For cosine similarity, use IndexFlatIP and L2-normalize both the stored vectors and the query vectors (for example with faiss.normalize_L2) before adding or searching.

Common mistake

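Forgetting to convert vectors to float32 NumPy arrays. FAISS expects 2-D float32 arrays of shape (n, dimension); depending on the FAISS version, passing float64 data (NumPy's default) either raises a type error or triggers a silent conversion and copy.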

Verified 2026-04