Concept Intermediate · 3 min read

What is answer relevancy in RAG evaluation

Quick answer
In Retrieval-Augmented Generation (RAG) evaluation, answer relevancy measures how closely the generated answer corresponds to the retrieved documents and the original query. It assesses whether the answer is factually grounded, contextually appropriate, and useful based on the retrieved knowledge.
Answer relevancy in RAG evaluation is the metric that quantifies how well a generated answer aligns with the retrieved documents and query context to ensure factual and contextual correctness.

How it works

Answer relevancy in RAG evaluation works by comparing the generated answer against the documents retrieved from a knowledge base that the model uses as context. Imagine a librarian fetching books (retrieved documents) to answer a question; answer relevancy checks if the librarian’s answer truly reflects the information in those books rather than inventing unrelated facts. This ensures the language model’s output is grounded and trustworthy.

Concrete example

Suppose a RAG system retrieves three documents about the Eiffel Tower and generates an answer to the query "When was the Eiffel Tower built?". Answer relevancy evaluates if the answer "The Eiffel Tower was built in 1887-1889" is supported by the retrieved documents.

python
from openai import OpenAI
import os

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

query = "When was the Eiffel Tower built?"
retrieved_docs = [
    "The Eiffel Tower construction started in 1887.",
    "It was completed in 1889 for the 1889 World's Fair.",
    "Located in Paris, it is a famous landmark."
]

# Simulate answer generation with retrieval context
messages = [
    {"role": "system", "content": "You are a helpful assistant using retrieved documents."},
    {"role": "user", "content": query},
    {"role": "system", "content": "Context: " + ' '.join(retrieved_docs)}
]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)

answer = response.choices[0].message.content
print("Generated answer:", answer)

# Answer relevancy check (conceptual):
# Does the answer mention 1887-1889 and align with retrieved docs?
relevancy = "Yes" if "1887" in answer and "1889" in answer else "No"
print("Answer relevancy:", relevancy)
output
Generated answer: The Eiffel Tower was built between 1887 and 1889 for the 1889 World's Fair in Paris.
Answer relevancy: Yes

When to use it

Use answer relevancy evaluation in RAG systems when you need to ensure that generated answers are factually grounded in retrieved knowledge, such as in customer support, medical information retrieval, or legal document summarization. Avoid relying solely on relevancy when creative or open-ended generation is required, as strict grounding may limit generative flexibility.

Key terms

TermDefinition
Answer relevancyDegree to which a generated answer aligns with retrieved documents and query context.
Retrieval-Augmented Generation (RAG)An AI architecture combining retrieval of documents with language model generation to produce grounded answers.
Retrieved documentsExternal knowledge sources fetched to provide context for answer generation.
GroundingEnsuring generated content is based on factual, external information rather than hallucination.

Key Takeaways

  • Answer relevancy ensures generated answers are factually supported by retrieved documents in RAG systems.
  • Evaluating answer relevancy improves trustworthiness and accuracy in knowledge-grounded AI applications.
  • Use answer relevancy metrics when factual correctness is critical, such as in legal or medical domains.
Verified 2026-04 · gpt-4o-mini
Verify ↗