How to · Beginner · 3 min read

How to do similarity search with embeddings

Quick answer
Use a model like text-embedding-3-small to convert text into vector embeddings, then compute similarity (e.g., cosine similarity) between vectors to find the closest matches. Store embeddings in a vector database or in-memory structure for efficient retrieval.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install openai>=1.0
  • pip install numpy

Setup

Install the openai Python package and set your API key as an environment variable. Also install numpy for vector math.

bash
pip install openai numpy
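The client reads your key from the `OPENAI_API_KEY` environment variable. The export below uses a placeholder value; substitute your actual key.

```shell
# Make the key available to the current shell session
# (replace the placeholder with your real key).
export OPENAI_API_KEY="sk-your-key-here"

# Confirm it is set before running the script.
echo "${OPENAI_API_KEY:+key is set}"
```

On Windows, use `setx OPENAI_API_KEY "sk-your-key-here"` in a new terminal instead.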

Step by step

This example shows how to embed a list of documents, embed a query, and find the most similar document using cosine similarity.

python
import os
import numpy as np
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def get_embedding(text):
    """Return the embedding vector for `text` as a NumPy array."""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return np.array(response.data[0].embedding)

def cosine_similarity(a, b):
    """Cosine similarity: the dot product divided by the product of the norms."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Sample documents
documents = [
    "The quick brown fox jumps over the lazy dog.",
    "Artificial intelligence and machine learning are fascinating.",
    "OpenAI provides powerful language models."
]

# Embed documents
doc_embeddings = [get_embedding(doc) for doc in documents]

# Query
query = "Tell me about AI and models"
query_embedding = get_embedding(query)

# Compute similarities
similarities = [cosine_similarity(query_embedding, doc_emb) for doc_emb in doc_embeddings]

# Find best match
best_idx = np.argmax(similarities)
print(f"Most similar document: {documents[best_idx]}")

output
Most similar document: Artificial intelligence and machine learning are fascinating.
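`argmax` returns only the single best match. To rank every document, sort the similarity scores instead. This sketch works on any list of scores; the values and document labels below are made up to keep it self-contained:

```python
import numpy as np

# Hypothetical similarity scores, one per document (stand-ins for
# the values computed by cosine_similarity above).
similarities = [0.12, 0.83, 0.47]
documents = ["fox sentence", "AI sentence", "models sentence"]

# argsort sorts ascending, so reverse for a highest-first ranking.
ranked = np.argsort(similarities)[::-1]
for idx in ranked:
    print(f"{similarities[idx]:.2f}  {documents[idx]}")
```

With the scores above, this prints the "AI sentence" first, then "models sentence", then "fox sentence".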

Common variations

  • Use vector databases like FAISS or Chroma for scalable similarity search.
  • Use async calls if embedding large batches.
  • Try different embedding models like text-embedding-3-large for higher quality.
  • Use other similarity metrics like Euclidean distance if preferred.
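Before reaching for a vector database, note that a single matrix multiplication handles thousands of documents: stack the embeddings into a matrix, L2-normalize the rows, and the query-versus-all cosine similarities fall out of one dot product. A minimal NumPy sketch, with random vectors standing in for real embeddings (1536 is the dimensionality of text-embedding-3-small):

```python
import numpy as np

rng = np.random.default_rng(0)
doc_matrix = rng.normal(size=(1000, 1536))  # 1000 fake "embeddings"
query = rng.normal(size=1536)

# Normalize rows and the query so dot products equal cosine similarity.
doc_norm = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
q_norm = query / np.linalg.norm(query)

similarities = doc_norm @ q_norm              # shape (1000,)
top5 = np.argsort(similarities)[-5:][::-1]    # indices of the 5 closest docs
print(top5)
```

The same normalized-matrix layout is what FAISS's inner-product index works on, so this code migrates naturally when the corpus outgrows memory-friendly NumPy.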

Troubleshooting

  • If embeddings are slow, batch inputs to reduce API calls.
  • If similarity scores are uniformly low, confirm that queries and documents were embedded with the same model; heavy preprocessing (lowercasing, stripping punctuation) is usually unnecessary for modern embedding models, but whatever preprocessing you do should be applied consistently to both.
  • Check your API key and environment variable if authentication errors occur.
  • Ensure numpy is installed for vector math.
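The embeddings endpoint accepts a list of strings in a single call, which is the simplest way to batch. The sketch below shows the chunking logic with a plain list so it runs without an API key; the batch size of 100 is an arbitrary choice, and the commented-out lines show how a batch would be sent with the `client` from the main example:

```python
def chunked(items, size):
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]

# With a real client, each batch goes out in one request, e.g.:
# embeddings = []
# for batch in chunked(documents, 100):
#     response = client.embeddings.create(
#         model="text-embedding-3-small", input=batch)
#     embeddings.extend(np.array(d.embedding) for d in response.data)

print(chunked(list(range(5)), 2))  # → [[0, 1], [2, 3], [4]]
```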

Key Takeaways

  • Convert text to embeddings using text-embedding-3-small for similarity search.
  • Compute cosine similarity between query and document embeddings to find closest matches.
  • Use vector databases like FAISS for large-scale, efficient similarity search.
  • Batch embedding requests to optimize API usage and speed.
  • Preprocess text consistently to improve embedding quality and similarity accuracy.
Verified 2026-04 · text-embedding-3-small