What is context recall in RAG evaluation?
Context recall in Retrieval-Augmented Generation (RAG) evaluation measures how well the retrieved documents or knowledge snippets cover the relevant information needed to answer a query. It quantifies the fraction of ground-truth context that the retrieval system successfully returns to the language model for generating accurate responses.
How it works
Context recall evaluates the retrieval step in a RAG pipeline by comparing the set of documents or passages retrieved against a known set of relevant context (ground truth). It calculates the proportion of relevant context that the retriever successfully returns. Think of it like a librarian fetching books: context recall measures how many of the exact books needed to answer a question the librarian actually brings back.
High context recall means the language model receives most or all relevant information, improving answer accuracy. Low recall means important context is missing, leading to incomplete or incorrect responses.
Concrete example
Suppose a query has 5 relevant context passages identified as ground truth. The retriever returns 3 passages, 2 of which are relevant.
ground_truth = {'p1', 'p2', 'p3', 'p4', 'p5'}
retrieved = {'p2', 'p3', 'p6'}
# Calculate context recall
relevant_retrieved = ground_truth.intersection(retrieved)
context_recall = len(relevant_retrieved) / len(ground_truth)
print(f"Context Recall: {context_recall:.2f}")  # Context Recall: 0.40
When to use it
Use context recall to evaluate and improve the retrieval component in RAG systems, especially when you have ground-truth context annotations. It helps diagnose if retrieval is missing key information before generation. Avoid relying solely on context recall when ground truth is unavailable or incomplete; combine with other metrics like precision or end-task accuracy.
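As a minimal sketch of combining metrics, the snippet below computes context recall alongside context precision (relevant retrieved passages divided by total retrieved passages) so that both missed context and retrieval noise are visible. The passage IDs are illustrative, not from any real dataset:

```python
# Hypothetical passage IDs for one query
ground_truth = {'p1', 'p2', 'p3', 'p4', 'p5'}
retrieved = {'p2', 'p3', 'p6'}

relevant_retrieved = ground_truth & retrieved

# Recall: how much of the needed context came back
context_recall = len(relevant_retrieved) / len(ground_truth)

# Precision: how much of what came back was actually needed
context_precision = len(relevant_retrieved) / len(retrieved)

print(f"Recall: {context_recall:.2f}, Precision: {context_precision:.2f}")
# Recall: 0.40, Precision: 0.67
```

A retriever that returns every document in the corpus would score perfect recall but near-zero precision, which is why the two metrics are reported together.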
Key terms
| Term | Definition |
|---|---|
| Context recall | Fraction of relevant context retrieved in RAG evaluation. |
| Retrieval-Augmented Generation (RAG) | AI architecture combining retrieval with LLM generation. |
| Ground truth context | Known relevant documents or passages for a query. |
| Retriever | Component that fetches documents to provide context to the LLM. |
Key takeaways
- Context recall measures how completely a RAG retriever returns relevant context.
- High context recall improves LLM answer accuracy by providing necessary information.
- Calculate context recall as relevant retrieved passages divided by total relevant passages.
- Use context recall to diagnose retrieval quality when ground truth context is available.
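The takeaways above can be sketched as a small reusable helper that averages context recall over a batch of queries. The function names, the empty-ground-truth convention, and the sample data are assumptions for illustration:

```python
def context_recall(ground_truth: set, retrieved: set) -> float:
    """Fraction of ground-truth passages that the retriever returned."""
    if not ground_truth:
        return 0.0  # convention when no relevant context is annotated
    return len(ground_truth & retrieved) / len(ground_truth)

def mean_context_recall(examples) -> float:
    """Average context recall over (ground_truth, retrieved) pairs."""
    scores = [context_recall(gt, rv) for gt, rv in examples]
    return sum(scores) / len(scores)

# Hypothetical evaluation batch
batch = [
    ({'p1', 'p2'}, {'p1', 'p2', 'p9'}),  # recall 1.0
    ({'p1', 'p2', 'p3', 'p4'}, {'p2'}),  # recall 0.25
]
print(mean_context_recall(batch))  # 0.625
```

Averaging per-query scores like this gives a single number for comparing retriever configurations, provided every query has ground-truth annotations.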