Comparison beginner · 4 min read

Cosine similarity vs dot product comparison

Quick answer
Use cosine similarity to measure the angle-based similarity between embeddings, normalizing for vector length. Use dot product when vector magnitude matters, as it combines length and direction in similarity scoring.

VERDICT

Use cosine similarity for most embedding similarity tasks due to its normalization and scale invariance; use dot product when embedding magnitude encodes meaningful information or for efficient approximate nearest neighbor search.
| Metric | Definition | Range | Normalization | Best for | Computational cost |
| --- | --- | --- | --- | --- | --- |
| Cosine similarity | Cosine of the angle between two vectors | -1 to 1 | Yes (vectors normalized) | Semantic similarity, scale-invariant tasks | Moderate (normalization + dot product) |
| Dot product | Sum of element-wise products | Unbounded (depends on vector length) | No | Magnitude-sensitive tasks, fast similarity | Low (multiply and sum; cost scales with vector size) |

| | Cosine similarity | Dot product |
| --- | --- | --- |
| Use case example | Text embedding similarity (vector length irrelevant) | Image feature matching (vector length encodes importance) |
| Interpretation | Direction only; ensures fair comparison | Length and direction; captures strength and presence, faster but less robust |

Key differences

Cosine similarity measures the cosine of the angle between two vectors, focusing on their direction and normalizing for length. Dot product multiplies vectors element-wise and sums the result, combining both magnitude and direction. This means cosine similarity is scale-invariant, while dot product is sensitive to vector length.

Cosine similarity ranges from -1 to 1, making it interpretable as a normalized similarity score. Dot product values are unbounded and depend on vector magnitudes, which can bias similarity if vector lengths vary.
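This scale sensitivity is easy to check numerically: scaling one vector rescales the dot product, but leaves the cosine similarity unchanged. A minimal NumPy sketch:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Doubling a vector doubles the dot product...
dot_orig = np.dot(a, b)
dot_scaled = np.dot(2 * a, b)

# ...but leaves the cosine similarity unchanged (scale invariance).
cos_orig = cosine(a, b)
cos_scaled = cosine(2 * a, b)

print(dot_orig, dot_scaled)                      # 32.0 64.0
print(round(cos_orig, 4), round(cos_scaled, 4))  # 0.9746 0.9746
```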

Side-by-side example

```python
import numpy as np

# Example vectors
vec_a = np.array([1, 2, 3])
vec_b = np.array([4, 5, 6])

# Dot product
dot = np.dot(vec_a, vec_b)

# Cosine similarity: dot product divided by the product of the norms
cosine = dot / (np.linalg.norm(vec_a) * np.linalg.norm(vec_b))

print(f"Dot product: {dot}")
print(f"Cosine similarity: {cosine:.4f}")
```

Output:

```
Dot product: 32
Cosine similarity: 0.9746
```
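Normalizing the vectors first makes the two metrics coincide: the dot product of unit-length vectors is exactly the cosine similarity. This is why many vector databases pre-normalize embeddings and then use the cheaper dot product. A quick check with the same vectors:

```python
import numpy as np

vec_a = np.array([1.0, 2.0, 3.0])
vec_b = np.array([4.0, 5.0, 6.0])

# L2-normalize each vector to unit length.
unit_a = vec_a / np.linalg.norm(vec_a)
unit_b = vec_b / np.linalg.norm(vec_b)

# The plain dot product of the unit vectors...
dot_of_units = np.dot(unit_a, unit_b)

# ...equals the cosine similarity of the originals.
cosine = np.dot(vec_a, vec_b) / (np.linalg.norm(vec_a) * np.linalg.norm(vec_b))

print(round(dot_of_units, 4))  # 0.9746
```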

Dot product with OpenAI embeddings

```python
from openai import OpenAI
import os

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Embeddings for two texts
response_a = client.embeddings.create(model="text-embedding-3-small", input="Hello world")
response_b = client.embeddings.create(model="text-embedding-3-small", input="Hi there")

vec_a = response_a.data[0].embedding
vec_b = response_b.data[0].embedding

# Compute dot-product similarity
similarity = sum(a * b for a, b in zip(vec_a, vec_b))
print(f"Dot product similarity: {similarity:.4f}")
```

Output:

```
Dot product similarity: 0.6312 (example value)
```

Note that OpenAI embeddings are returned normalized to unit length, so this dot product is identical to the cosine similarity and always lies between -1 and 1.

When to use each

Use cosine similarity when you want to compare embeddings regardless of their magnitude, such as semantic similarity in NLP or image features where scale varies. Use dot product when vector length encodes meaningful information, like confidence or frequency, or when you need faster approximate similarity computations.

In practice, cosine similarity is preferred for normalized embedding comparisons, while dot product is common in recommendation systems and some vector databases optimized for speed.

| Use case | Preferred metric | Reason |
| --- | --- | --- |
| Semantic text similarity | Cosine similarity | Normalizes vector length for fair comparison |
| Recommendation ranking | Dot product | Captures magnitude as importance or confidence |
| Approximate nearest neighbor search | Dot product | Faster computation, hardware optimized |
| Image feature matching | Cosine similarity | Focuses on direction, ignoring scale |
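The choice of metric can change which neighbor ranks first. In the sketch below (hypothetical 3-dimensional "embeddings"), the dot product favors a long vector while cosine favors an exact direction match:

```python
import numpy as np

# Hypothetical mini-corpus of four embeddings in three dimensions.
corpus = np.array([
    [1.0, 2.0, 3.0],   # same direction as the query
    [4.0, 5.0, 6.0],   # similar direction, much larger magnitude
    [-1.0, 0.0, 1.0],
    [0.5, 0.5, 0.5],
])
query = np.array([1.0, 2.0, 3.0])

# Dot-product scores: a single matrix-vector multiply.
dot_scores = corpus @ query

# Cosine scores: divide out both vectors' norms.
corpus_norms = np.linalg.norm(corpus, axis=1)
cos_scores = dot_scores / (corpus_norms * np.linalg.norm(query))

print(np.argmax(dot_scores))  # 1 — the long vector wins on magnitude
print(np.argmax(cos_scores))  # 0 — the direction match wins
```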

Pricing and access

Both cosine similarity and dot product are mathematical operations you implement locally or delegate to your vector database. The embeddings they compare come from APIs such as OpenAI's, from third-party embedding providers such as Voyage AI (which Anthropic recommends, since it offers no first-party embeddings API), or from local models, each with its own pricing.

| Option | Free | Paid | API access |
| --- | --- | --- | --- |
| OpenAI embeddings | Yes (limited) | Yes | OpenAI API with the text-embedding-3-small model |
| Anthropic | No first-party embeddings API | – | Anthropic recommends third-party providers such as Voyage AI |
| Local embeddings | Yes | No | sentence-transformers or similar libraries |
| Vector DB similarity | Depends on DB | Depends on DB | Most support both cosine and dot product |

Key Takeaways

  • Cosine similarity normalizes vectors, making it ideal for semantic similarity tasks.
  • Dot product includes vector magnitude, useful when length encodes importance.
  • Use cosine similarity for fair comparison; use dot product for speed or magnitude-sensitive use cases.
Verified 2026-04 · text-embedding-3-small, claude-3-5-sonnet-20241022