What is the difference between dense and sparse retrieval?
Dense retrieval encodes queries and documents into continuous vector embeddings using neural models, typically searched with a vector index such as FAISS. Sparse retrieval relies on traditional keyword-based indexing like BM25, matching exact or partial terms for fast, interpretable search.

Verdict: use dense retrieval for semantic, context-rich search in RAG; use sparse retrieval for fast, interpretable keyword matching and when computational resources are limited.

| Method | Representation | Search type | Speed | Best for | Complexity |
|---|---|---|---|---|---|
| Dense retrieval | Continuous vector embeddings | Semantic similarity | Slower (requires vector search) | Contextual, semantic search | Higher (requires embedding models) |
| Sparse retrieval | Sparse term vectors (bag-of-words) | Keyword matching | Faster (inverted index) | Exact keyword search, interpretability | Lower (traditional IR methods) |
| Dense + ANN | Embeddings + Approximate Nearest Neighbor | Semantic similarity with speedup | Faster than brute-force dense | Large-scale semantic search | Moderate (indexing + embedding) |
| Sparse + BM25 | Term frequency and inverse document frequency | Weighted keyword matching | Very fast | Classic IR, baseline retrieval | Low |
Key differences
Dense retrieval encodes queries and documents into dense vector embeddings using neural networks, capturing semantic meaning beyond exact words. Sparse retrieval uses traditional inverted indexes and term frequency statistics to match keywords directly. Dense retrieval excels at understanding context and synonyms, while sparse retrieval is faster and more interpretable.
Side-by-side example: sparse retrieval with BM25
This example uses the rank_bm25 Python library to perform sparse retrieval based on keyword matching.
```python
from rank_bm25 import BM25Okapi

corpus = [
    "The quick brown fox jumps over the lazy dog.",
    "A fast fox and a lazy dog play in the yard.",
    "The dog is quick and brown."
]

# Tokenize the corpus and build the BM25 index
tokenized_corpus = [doc.lower().split() for doc in corpus]
bm25 = BM25Okapi(tokenized_corpus)

# Score the query against every document and pick the best match
query = "quick fox"
tokenized_query = query.lower().split()
scores = bm25.get_scores(tokenized_query)
best_doc_index = scores.argmax()
print(f"Best matching document: {corpus[best_doc_index]}")
```

Output:

```
Best matching document: The quick brown fox jumps over the lazy dog.
```
Equivalent example: dense retrieval with embeddings
This example uses OpenAI embeddings and FAISS for dense retrieval, matching semantic similarity.
```python
import os
from openai import OpenAI
import faiss
import numpy as np

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

corpus = [
    "The quick brown fox jumps over the lazy dog.",
    "A fast fox and a lazy dog play in the yard.",
    "The dog is quick and brown."
]

# Get embeddings for the corpus
corpus_embeddings = []
for doc in corpus:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=doc
    )
    corpus_embeddings.append(response.data[0].embedding)
corpus_embeddings = np.array(corpus_embeddings).astype('float32')

# Build the FAISS index
index = faiss.IndexFlatL2(corpus_embeddings.shape[1])
index.add(corpus_embeddings)

# Embed the query
query = "quick fox"
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=query
)
query_embedding = np.array(response.data[0].embedding).astype('float32').reshape(1, -1)

# Search for the nearest neighbor
D, I = index.search(query_embedding, k=1)
print(f"Best matching document: {corpus[I[0][0]]}")
```

Output:

```
Best matching document: The quick brown fox jumps over the lazy dog.
```
When to use each
Use dense retrieval when:
- You need semantic understanding beyond keywords.
- Handling synonyms, paraphrases, or context-rich queries.
- Working with large-scale datasets where approximate nearest neighbor search is feasible.
Use sparse retrieval when:
- Speed and interpretability are priorities.
- Resources for embedding models are limited.
- Exact keyword matching suffices, such as legal or scientific text search.
| Scenario | Recommended retrieval type |
|---|---|
| Semantic question answering | Dense retrieval |
| Keyword-based document filtering | Sparse retrieval |
| Large-scale semantic search with GPUs | Dense retrieval + ANN |
| Simple keyword search on small datasets | Sparse retrieval |
Pricing and access
Dense retrieval typically requires an embedding model, which may incur API costs (e.g., OpenAI embeddings), although open-source embedding models avoid this. Sparse retrieval uses open-source libraries with no API cost.
| Option | Free | Paid | API access |
|---|---|---|---|
| Sparse retrieval (BM25, Lucene) | Yes | No | No |
| Dense retrieval (OpenAI embeddings + FAISS) | No (embedding API calls cost) | Yes | Yes |
| Dense retrieval (Open-source models + FAISS) | Yes | No | No |
| Hybrid (Dense + Sparse) | Depends on components | Depends on components | Depends on components |
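Note that the cost in the dense rows comes from producing embeddings, not from the search itself. Once embeddings exist, brute-force dense search needs nothing beyond NumPy. The sketch below illustrates this with toy 4-dimensional vectors standing in for real model output (the vectors and the `cosine_search` helper are illustrative, not from any library):

```python
import numpy as np

# Toy 4-dimensional "embeddings" standing in for real model output.
corpus_embeddings = np.array([
    [0.9, 0.1, 0.0, 0.1],
    [0.2, 0.8, 0.1, 0.0],
    [0.1, 0.2, 0.9, 0.0],
], dtype=np.float32)

query_embedding = np.array([0.85, 0.15, 0.05, 0.1], dtype=np.float32)

def cosine_search(docs, query):
    # Normalize rows, then cosine similarity is a single matrix-vector product.
    docs_n = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    scores = docs_n @ query_n
    return int(scores.argmax()), scores

best, scores = cosine_search(corpus_embeddings, query_embedding)
print(f"Best matching document index: {best}")  # 0: most aligned with the query
```

For a handful of documents this is exactly what `IndexFlatL2`-style exhaustive search does; FAISS and ANN indexes matter once the corpus grows to millions of vectors.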
Key takeaways
- Dense retrieval uses vector embeddings for semantic search, capturing meaning beyond keywords.
- Sparse retrieval relies on keyword matching with inverted indexes, offering speed and interpretability.
- Use dense retrieval for context-rich queries and sparse retrieval for fast, exact keyword search.
- Dense retrieval often requires API calls for embeddings, while sparse retrieval can be fully open-source.
- Hybrid approaches combine strengths of both for improved retrieval performance.
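In practice, the hybrid approach in the last takeaway often comes down to score fusion: min-max normalize each retriever's scores so they share a scale, then take a weighted sum. A minimal sketch with made-up score lists (the helper names and the example numbers are illustrative; a real system would plug in BM25 scores and embedding similarities):

```python
def minmax(scores):
    # Rescale a score list to [0, 1] so sparse and dense scores are comparable.
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_scores(sparse, dense, alpha=0.5):
    # Weighted sum of min-max-normalized sparse and dense scores.
    s, d = minmax(sparse), minmax(dense)
    return [alpha * a + (1 - alpha) * b for a, b in zip(s, d)]

# Made-up scores for three documents from each retriever.
sparse = [2.1, 0.3, 1.0]     # e.g., BM25 scores
dense = [0.40, 0.95, 0.90]   # e.g., cosine similarities

fused = hybrid_scores(sparse, dense, alpha=0.5)
best = max(range(len(fused)), key=fused.__getitem__)
print(f"Best document index after fusion: {best}")  # 2
```

Here document 2 wins even though it tops neither list individually, which is the point of fusion: it rewards documents that do reasonably well on both keyword and semantic signals. The weight `alpha` is a tuning knob; alternatives such as reciprocal rank fusion avoid score normalization entirely.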