How to use RAGAS for RAG testing

Quick answer
RAGAS is a Python library for evaluating Retrieval-Augmented Generation (RAG) pipelines, which combine document retrieval with LLM generation. It does not run retrieval or generation itself: collect the questions, retrieved contexts, and generated answers from your own pipeline, then call evaluate() with metrics such as faithfulness and answer relevancy to score retrieval and generation quality.

PREREQUISITES

  • Python 3.8+
  • RAGAS Python SDK installed (pip install ragas)
  • OpenAI API key or other LLM API key
  • Access to a document store or vector database

Setup

Install the ragas Python package and set your environment variables for API keys. Prepare your document store or vector database for retrieval.

bash
pip install ragas

export OPENAI_API_KEY="your-api-key"
output
Collecting ragas
  Downloading ragas-1.0.0-py3-none-any.whl (50 kB)
Installing collected packages: ragas
Successfully installed ragas-1.0.0
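
Before running an evaluation, it can help to confirm that the package is importable and the API key is visible to Python. The following is a minimal sketch, not part of RAGAS: `check_setup` is a hypothetical helper name, and it assumes the key lives in `OPENAI_API_KEY` as in the setup above.

```python
import os
from importlib import metadata


def check_setup(package: str = "ragas", env_var: str = "OPENAI_API_KEY") -> list:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    try:
        # Raises PackageNotFoundError if the distribution is not installed
        metadata.version(package)
    except metadata.PackageNotFoundError:
        problems.append(f"{package} is not installed")
    if not os.environ.get(env_var):
        problems.append(f"{env_var} is not set")
    return problems


print(check_setup() or "Setup looks good")
```

Returning a list rather than raising on the first problem lets a test script report every missing piece at once.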

Step by step

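Run your RAG pipeline on a set of test questions and record, for each one, the retrieved contexts and the generated answer. Pass those records to ragas.evaluate() to score them. The example below uses the ragas 0.1.x column names; newer releases may expect a different schema, so check the documentation for your installed version.

python
import os
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, faithfulness

# The metrics call an LLM judge; by default RAGAS uses OpenAI via OPENAI_API_KEY
if "OPENAI_API_KEY" not in os.environ:
    raise RuntimeError("Set OPENAI_API_KEY before running the evaluation")

# One entry per test case, recorded from your own RAG pipeline
data = {
    "question": ["Explain the benefits of Retrieval-Augmented Generation."],
    "answer": [
        "RAG improves language model outputs by combining document retrieval "
        "with generation, enabling more accurate and context-aware answers."
    ],
    # Each question maps to the list of passages the retriever returned
    "contexts": [[
        "Retrieval-Augmented Generation grounds model answers in retrieved documents."
    ]],
}
dataset = Dataset.from_dict(data)

# Score the recorded samples with the chosen metrics
result = evaluate(dataset, metrics=[faithfulness, answer_relevancy])
print("Scores:", result)
output
Scores: a dict-like result with one value between 0 and 1 per metric. Exact values depend on the judge model and vary between runs.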
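
RAGAS metrics score recorded question, context, and answer records, so a test run needs some way to gather those from the pipeline under test. The following is a minimal sketch under stated assumptions: `collect_samples`, `fake_retrieve`, and `fake_generate` are hypothetical names, the two callables stand in for your own retriever and LLM, and the `question` / `answer` / `contexts` keys follow the ragas 0.1.x dataset schema, which may differ in other versions.

```python
from typing import Callable, Dict, List


def collect_samples(
    questions: List[str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> Dict[str, list]:
    """Run each question through a RAG pipeline and record what it produced."""
    data: Dict[str, list] = {"question": [], "answer": [], "contexts": []}
    for q in questions:
        contexts = retrieve(q)
        data["question"].append(q)
        data["contexts"].append(contexts)
        data["answer"].append(generate(q, contexts))
    return data


# Toy stand-ins so the sketch runs without a vector store or an LLM
def fake_retrieve(question: str) -> List[str]:
    return ["RAG combines retrieval with generation."]


def fake_generate(question: str, contexts: List[str]) -> str:
    return "Based on the context: " + contexts[0]


samples = collect_samples(["What is RAG?"], fake_retrieve, fake_generate)
print(samples)
```

The returned dict has one list per column, which is the shape `datasets.Dataset.from_dict` accepts.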

Common variations

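  • Compare retrievers such as Elasticsearch or Pinecone by swapping them in your own pipeline and re-running the same evaluation.
  • Compare generators such as gpt-4o-mini or claude-sonnet-4-5 the same way.
  • evaluate() is a blocking call; inside an async application, run it in a worker thread so the event loop stays responsive.
python
import asyncio
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness

async def async_evaluate(dataset: Dataset):
    loop = asyncio.get_running_loop()
    # evaluate() blocks, so hand it to the default thread pool
    result = await loop.run_in_executor(None, lambda: evaluate(dataset, metrics=[faithfulness]))
    print("Async scores:", result)

# dataset: a datasets.Dataset with question, answer, and contexts columns
asyncio.run(async_evaluate(dataset))
output
Async scores: a dict-like result with one value per metric. Exact values depend on the judge model and vary between runs.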
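
When swapping retrievers or generator models, comparing metric scores between two runs makes regressions easier to spot. The following is a minimal sketch, assuming each run's scores are available as a plain dict of metric name to float. `score_deltas` and `regressions` are hypothetical helpers, the 0.02 tolerance is arbitrary, and the numbers are made-up placeholders rather than real results.

```python
from typing import Dict, List


def score_deltas(baseline: Dict[str, float], candidate: Dict[str, float]) -> Dict[str, float]:
    """Per-metric change (candidate minus baseline) for metrics present in both runs."""
    return {m: round(candidate[m] - baseline[m], 4) for m in baseline if m in candidate}


def regressions(deltas: Dict[str, float], tolerance: float = 0.02) -> List[str]:
    """Names of metrics that dropped by more than the tolerance."""
    return sorted(m for m, d in deltas.items() if d < -tolerance)


# Placeholder scores for illustration only
baseline = {"faithfulness": 0.90, "answer_relevancy": 0.80}
candidate = {"faithfulness": 0.84, "answer_relevancy": 0.85}

deltas = score_deltas(baseline, candidate)
print(deltas)
print(regressions(deltas))
```

A tolerance is useful because LLM-judged scores fluctuate between runs, so small drops are not necessarily meaningful.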

Troubleshooting

  • If retrieval returns no documents, verify your index is populated and retriever config is correct.
  • If generation fails, check your API key and model name.
  • For latency issues, consider caching retrieved documents or using smaller models.
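
The caching suggestion above can be sketched with the standard library's `functools.lru_cache`. This is an illustration under assumptions: `slow_retrieve` is a hypothetical stand-in for a vector-store lookup, and the call counter exists only to show that the second lookup is served from the cache.

```python
from functools import lru_cache

CALLS = {"count": 0}


def slow_retrieve(query: str) -> tuple:
    """Hypothetical stand-in for an expensive vector-store lookup."""
    CALLS["count"] += 1
    return (f"document about {query}",)


@lru_cache(maxsize=256)
def cached_retrieve(query: str) -> tuple:
    # Tuples are immutable, so callers cannot accidentally mutate a cached result
    return slow_retrieve(query)


cached_retrieve("rag benefits")
cached_retrieve("rag benefits")  # served from the cache, no second lookup
print(CALLS["count"])  # 1
```

An in-memory cache like this only helps for repeated identical queries within one process; it must be cleared (`cached_retrieve.cache_clear()`) whenever the index changes.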

Key Takeaways

  • Use RAGAS to score the retrieval and generation quality of a RAG pipeline; it evaluates recorded outputs rather than producing them.
  • Swap retrievers and generators in your own pipeline and re-run the same evaluation to compare them.
  • evaluate() is blocking; run it in a worker thread when calling it from async code.
  • Ensure your document index is populated to get meaningful retrieval results.
  • Monitor API keys and model names to avoid generation errors.
Verified 2026-04 · gpt-4o, gpt-4o-mini, claude-sonnet-4-5