What is MMR in vector search
MMR (Maximal Marginal Relevance) is a re-ranking algorithm that balances the relevance of results with their diversity to reduce redundancy. It selects vectors that are both similar to the query and dissimilar to each other, improving the quality of search results.
How it works
MMR works by iteratively selecting vectors that maximize relevance to the query while minimizing similarity to already selected results. This ensures the final set of results is both relevant and diverse, avoiding repetitive or near-duplicate items. Think of it like picking a playlist: you want songs you like (relevance) but also variety (diversity) so the list isn’t monotonous.
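The selection rule described above is commonly written as the formula below. The symbols are notation chosen for this sketch: $q$ is the query, $R$ the set of candidates, $S$ the results already selected, $\mathrm{sim}$ a similarity function such as cosine similarity, and $\lambda \in [0, 1]$ the weight that trades relevance against diversity.

$$
\mathrm{MMR} = \arg\max_{d_i \in R \setminus S} \left[ \lambda \, \mathrm{sim}(d_i, q) - (1 - \lambda) \max_{d_j \in S} \mathrm{sim}(d_i, d_j) \right]
$$

With $\lambda = 1$ the rule reduces to plain relevance ranking. With $\lambda = 0$ it ignores the query and selects only for dissimilarity to what has already been picked.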
Concrete example
The following Python example demonstrates a simple MMR implementation using cosine similarity to re-rank vector search results:
```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Dummy vectors for illustration
query_vector = np.array([[0.1, 0.3, 0.5]])
doc_vectors = np.array([
    [0.1, 0.3, 0.5],    # Identical to the query: most relevant
    [0.1, 0.29, 0.48],  # Near-duplicate of the first
    [0.9, 0.1, 0.2],    # Less relevant, points in a different direction
    [0.2, 0.4, 0.6],    # Relevant, close in direction to the first
])

# Parameters
lambda_param = 0.7  # Balance relevance vs diversity (1.0 = relevance only)

# Compute similarity of every document to the query
sim_to_query = cosine_similarity(doc_vectors, query_vector).flatten()

selected = []
candidate_indices = list(range(len(doc_vectors)))

while candidate_indices:
    if not selected:
        # Pick the most relevant first
        idx = candidate_indices[np.argmax(sim_to_query[candidate_indices])]
        selected.append(idx)
        candidate_indices.remove(idx)
    else:
        mmr_scores = []
        for i in candidate_indices:
            # Highest similarity between candidate i and any selected document
            sim_to_selected = max(cosine_similarity(
                doc_vectors[i].reshape(1, -1),
                doc_vectors[selected]
            ).flatten())
            score = lambda_param * sim_to_query[i] - (1 - lambda_param) * sim_to_selected
            mmr_scores.append(score)
        idx = candidate_indices[np.argmax(mmr_scores)]
        selected.append(idx)
        candidate_indices.remove(idx)

print("MMR re-ranked indices:", selected)
```
Output:
```
MMR re-ranked indices: [0, 1, 3, 2]
```
Because document 0 is identical to the query in this toy data, every candidate's similarity to the query equals its similarity to document 0. With lambda_param = 0.7 the diversity penalty therefore only rescales relevance, and the near-duplicate (index 1) still ranks second. Setting lambda_param below 0.5 makes the penalty dominate and moves index 2 ahead of the near-duplicates.
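The loop above can also be packaged as a reusable function. The following is a minimal sketch, not a library API: the name `mmr_rerank` and its `top_k` parameter are hypothetical, it assumes every vector is non-zero, and it replaces the per-candidate `cosine_similarity` calls with a single similarity matrix computed up front in NumPy.

```python
import numpy as np


def mmr_rerank(query, docs, lambda_param=0.7, top_k=None):
    """Return document indices in MMR order (hypothetical helper)."""
    query = np.asarray(query, dtype=float).ravel()
    docs = np.asarray(docs, dtype=float)

    # Normalize so that dot products equal cosine similarities.
    # Assumption: no vector has zero length.
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    sim_to_query = d @ q        # shape (n,)
    sim_between_docs = d @ d.T  # shape (n, n), computed once

    n = len(docs)
    k = n if top_k is None else min(top_k, n)
    selected = []
    candidates = list(range(n))

    while candidates and len(selected) < k:
        relevance = sim_to_query[candidates]
        if selected:
            # Redundancy: highest similarity to any already selected document.
            redundancy = sim_between_docs[np.ix_(candidates, selected)].max(axis=1)
            scores = lambda_param * relevance - (1 - lambda_param) * redundancy
        else:
            # Nothing selected yet, so the first pick is the most relevant.
            scores = relevance
        best = candidates[int(np.argmax(scores))]
        selected.append(best)
        candidates.remove(best)
    return selected


# Same toy vectors as in the example above.
docs = [
    [0.1, 0.3, 0.5],
    [0.1, 0.29, 0.48],
    [0.9, 0.1, 0.2],
    [0.2, 0.4, 0.6],
]
query = [0.1, 0.3, 0.5]

for lam in (1.0, 0.7, 0.3):
    print(lam, mmr_rerank(query, docs, lambda_param=lam))
```

For these vectors, 1.0 and 0.7 both give `[0, 1, 3, 2]`, while 0.3 gives `[0, 2, 3, 1]`: the dissimilar document only moves up once the diversity term outweighs relevance. Passing `top_k` stops the loop early, which is usually what a search pipeline wants when it re-ranks a larger candidate pool down to a handful of results.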
When to use it
Use MMR in vector search when you want to improve result diversity and reduce redundancy, especially in applications like document retrieval, recommendation systems, or question answering. Avoid MMR if you only need the single most relevant result or if diversity is not a priority, as it may reduce precision in favor of variety.
Key terms
| Term | Definition |
|---|---|
| Maximal Marginal Relevance (MMR) | An algorithm balancing relevance and diversity in ranked search results. |
| Relevance | Similarity of a vector to the query vector. |
| Diversity | Difference or dissimilarity among selected vectors to avoid redundancy. |
| Cosine similarity | A metric measuring the cosine of the angle between two vectors, indicating similarity. |
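To make the cosine similarity definition concrete, here is a small sketch in plain NumPy. The helper name `cosine_sim` is made up for this illustration and the function assumes both vectors are non-zero.

```python
import numpy as np


def cosine_sim(a, b):
    """Cosine of the angle between two non-zero vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


print(cosine_sim([1, 0], [2, 0]))   # same direction, different length
print(cosine_sim([1, 0], [0, 3]))   # orthogonal
print(cosine_sim([1, 0], [-1, 0]))  # opposite direction
```

The three calls print 1.0, 0.0, and -1.0. The measure depends only on direction, not on vector length, which is why MMR treats two vectors pointing the same way as redundant even when their magnitudes differ.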
Key takeaways
- MMR balances relevance and diversity to improve vector search results by reducing redundancy.
- It iteratively selects vectors maximizing query similarity while minimizing similarity to already chosen results.
- Use MMR when diverse, non-redundant results are more valuable than just top relevance.
- MMR requires tuning the balance parameter (lambda) to fit your application's needs.
- Cosine similarity is commonly used to measure relevance and diversity in MMR.