Concept · Beginner · 3 min read

What is cosine similarity in AI?

Quick answer
Cosine similarity is a metric that measures the cosine of the angle between two non-zero vectors in a multi-dimensional space, indicating how similar they are regardless of magnitude. In AI, it is commonly used to compare text embeddings or feature vectors to find semantic similarity.

How it works

Cosine similarity calculates the cosine of the angle between two vectors, producing a value between -1 and 1. A value of 1 means the vectors point in the same direction (high similarity), 0 means they are orthogonal (no similarity), and -1 means they point in opposite directions. Imagine two arrows originating from the same point: the smaller the angle between them, the more similar their directions.

In AI, vectors often represent text or data features, so cosine similarity helps measure semantic closeness without being affected by vector length.
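The length-invariance claim above is easy to verify numerically: scaling either vector by any positive constant leaves the cosine similarity unchanged, because the scale factor appears in both the dot product and the norm. A minimal sketch (the helper function `cosine_sim` is defined here for illustration):

```python
import numpy as np

def cosine_sim(a, b):
    # cosine of the angle between two vectors
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Scaling b by a positive constant does not change the result
print(cosine_sim(a, b))       # ≈ 0.9746
print(cosine_sim(a, 10 * b))  # same value
```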

Concrete example

Given two vectors A and B, cosine similarity is computed as:

cosine_similarity = (A · B) / (||A|| * ||B||)

where · is the dot product and ||A|| is the Euclidean norm of vector A.

```python
import numpy as np

# Example vectors
A = np.array([1, 2, 3])
B = np.array([4, 5, 6])

# Compute cosine similarity: dot product divided by the product of norms
dot_product = np.dot(A, B)
norm_A = np.linalg.norm(A)
norm_B = np.linalg.norm(B)
cosine_similarity = dot_product / (norm_A * norm_B)

print(f"Cosine similarity: {cosine_similarity:.4f}")
```

Output:

```
Cosine similarity: 0.9746
```

When to use it

Use cosine similarity when you need to measure the similarity between two vectors representing text, images, or other features, especially when vector magnitude is irrelevant. It is ideal for tasks like document retrieval, recommendation systems, and clustering in AI.
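The retrieval use case can be sketched in a few lines: score each document's vector against a query vector and sort by similarity. The 2-D vectors below are made-up toy values standing in for real embeddings, chosen only to illustrate the ranking step:

```python
import numpy as np

def cosine_sim(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy "document embeddings" (illustrative values, not real model output)
docs = {
    "cats and dogs": np.array([0.9, 0.1]),
    "pet care tips": np.array([0.8, 0.3]),
    "stock markets": np.array([0.1, 0.95]),
}
query = np.array([1.0, 0.2])

# Rank documents by similarity to the query, most similar first
ranked = sorted(docs, key=lambda d: cosine_sim(query, docs[d]), reverse=True)
print(ranked)
```

In a real system the vectors would come from an embedding model, but the ranking logic is the same.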

Avoid cosine similarity when vector magnitude carries meaning, such as raw counts where a longer vector reflects more data: cosine discards that information, so a magnitude-sensitive measure like Euclidean distance may be a better fit.
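A quick sketch of why magnitude-sensitivity can matter: two vectors pointing the same way but with very different lengths are identical under cosine similarity, while Euclidean distance keeps them apart.

```python
import numpy as np

a = np.array([1.0, 1.0])
b = np.array([100.0, 100.0])

# Same direction, very different magnitudes
cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
euclid = np.linalg.norm(a - b)

print(cos)     # 1.0 — cosine treats them as identical
print(euclid)  # ≈ 140.0 — Euclidean distance sees them as far apart
```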

Key terms

Cosine similarity: A measure of similarity between two vectors based on the cosine of the angle between them.
Dot product: An algebraic operation that multiplies corresponding entries of two vectors and sums the results.
Euclidean norm: The length or magnitude of a vector, calculated as the square root of the sum of squared components.
Vector embedding: A numeric representation of data (like text) in a continuous vector space.

Key Takeaways

  • Cosine similarity measures how aligned two vectors are, ignoring their magnitude.
  • It is widely used in AI for comparing text embeddings and feature vectors.
  • Calculate it using the dot product divided by the product of vector norms.
  • Ideal for semantic similarity tasks like search and recommendation.
  • Not suitable when vector magnitude is critical to the comparison.
Verified 2026-04