What is context recall in RAG evaluation?
Context recall in Retrieval-Augmented Generation (RAG) evaluation measures how well the retrieved documents or knowledge snippets cover the relevant information needed to answer a query. It quantifies the fraction of ground-truth context that the retrieval system successfully returns to the language model for generating accurate responses.
How it works
Context recall evaluates the retrieval step in a RAG pipeline by comparing the set of documents or passages retrieved against a known set of relevant context (ground truth). It calculates the proportion of relevant context that the retriever successfully returns. Think of it like a librarian fetching books: context recall measures how many of the exact books needed to answer a question the librarian actually brings back.
High context recall means the language model receives most or all relevant information, improving answer accuracy. Low recall means important context is missing, leading to incomplete or incorrect responses.
Concrete example
Suppose a query has 5 relevant context passages identified as ground truth. The retriever returns 3 passages, 2 of which are relevant.
ground_truth = {'p1', 'p2', 'p3', 'p4', 'p5'}
retrieved = {'p2', 'p3', 'p6'}
# Calculate context recall
relevant_retrieved = ground_truth.intersection(retrieved)
context_recall = len(relevant_retrieved) / len(ground_truth)
print(f"Context Recall: {context_recall:.2f}")  # Context Recall: 0.40
When to use it
Use context recall to evaluate and improve the retrieval component in RAG systems, especially when you have ground-truth context annotations. It helps diagnose if retrieval is missing key information before generation. Avoid relying solely on context recall when ground truth is unavailable or incomplete; combine with other metrics like precision or end-task accuracy.
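As a minimal sketch of combining metrics, the snippet below computes context recall alongside context precision (relevant retrieved passages divided by total retrieved passages) so that both missed context and retrieval noise are visible. The passage IDs are illustrative, not from any real dataset:

```python
# Hypothetical passage IDs for one query
ground_truth = {'p1', 'p2', 'p3', 'p4', 'p5'}
retrieved = {'p2', 'p3', 'p6'}

relevant_retrieved = ground_truth & retrieved

# Recall: how much of the needed context came back
context_recall = len(relevant_retrieved) / len(ground_truth)

# Precision: how much of what came back was actually needed
context_precision = len(relevant_retrieved) / len(retrieved)

print(f"Recall: {context_recall:.2f}, Precision: {context_precision:.2f}")
# Recall: 0.40, Precision: 0.67
```

A retriever that returns every document in the corpus would score perfect recall but near-zero precision, which is why the two metrics are reported together.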
Key terms
| Term | Definition |
|---|---|
| Context recall | Fraction of relevant context retrieved in RAG evaluation. |
| Retrieval-Augmented Generation (RAG) | AI architecture combining retrieval with LLM generation. |
| Ground truth context | Known relevant documents or passages for a query. |
| Retriever | Component that fetches documents to provide context to the LLM. |
Key takeaways
- Context recall measures how completely a RAG retriever returns relevant context.
- High context recall improves LLM answer accuracy by providing necessary information.
- Calculate context recall as relevant retrieved passages divided by total relevant passages.
- Use context recall to diagnose retrieval quality when ground truth context is available.
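The takeaways above can be sketched as a small reusable helper that averages context recall over a batch of queries. The function names, the empty-ground-truth convention, and the sample data are assumptions for illustration:

```python
def context_recall(ground_truth: set, retrieved: set) -> float:
    """Fraction of ground-truth passages that the retriever returned."""
    if not ground_truth:
        return 0.0  # convention when no relevant context is annotated
    return len(ground_truth & retrieved) / len(ground_truth)

def mean_context_recall(examples) -> float:
    """Average context recall over (ground_truth, retrieved) pairs."""
    scores = [context_recall(gt, rv) for gt, rv in examples]
    return sum(scores) / len(scores)

# Hypothetical evaluation batch
batch = [
    ({'p1', 'p2'}, {'p1', 'p2', 'p9'}),  # recall 1.0
    ({'p1', 'p2', 'p3', 'p4'}, {'p2'}),  # recall 0.25
]
print(mean_context_recall(batch))  # 0.625
```

Averaging per-query scores like this gives a single number for comparing retriever configurations, provided every query has ground-truth annotations.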