How to use Haystack RAGAS integration
Quick answer
RAGAS evaluates retrieval-augmented generation (RAG) pipelines, so using it with Haystack starts with a working pipeline: a retriever, backed by an in-memory store or a vector index such as FAISS, combined with a generator such as OpenAIGenerator. The pipeline queries documents and generates an answer in a single run.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install haystack-ai openai faiss-cpu sentence-transformers
Setup
Install the required packages and set your OpenAI API key as an environment variable.
- Install Haystack v2, the OpenAI SDK, sentence-transformers for the embedding model, and FAISS if you plan to use it for vector search:
pip install haystack-ai openai faiss-cpu sentence-transformers
Step by step
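This example shows how to create a simple RAG pipeline with Haystack, using an embedding retriever over an in-memory document store and OpenAIGenerator as the generator. It embeds and indexes a few documents, then runs a query to get a generated answer. The same structure applies when the vectors live in FAISS or another vector store.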
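import os
from haystack import Document, Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.embedders import (
    SentenceTransformersDocumentEmbedder,
    SentenceTransformersTextEmbedder,
)
from haystack.components.generators import OpenAIGenerator
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
# Requires: pip install haystack-ai openai sentence-transformers
# Set your OpenAI API key in the environment, ideally outside the script:
# os.environ["OPENAI_API_KEY"] = "your_api_key_here"
EMBEDDING_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
# Initialize document store (vectors are kept in memory in this example)
document_store = InMemoryDocumentStore()
# Sample documents
docs = [
    Document(content="Haystack is an open-source framework for building search systems."),
    Document(content="Ragas is an open-source framework for evaluating retrieval-augmented generation (RAG) pipelines."),
    Document(content="OpenAI models can be used as generators in Haystack pipelines."),
]
# Embed the documents, then write them (with their embeddings) to the store
doc_embedder = SentenceTransformersDocumentEmbedder(model=EMBEDDING_MODEL)
doc_embedder.warm_up()
document_store.write_documents(doc_embedder.run(documents=docs)["documents"])
# Prompt that passes the retrieved documents to the generator
template = """Answer the question using only the context below.
Context:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Question: {{ question }}
Answer:"""
# Build the pipeline: embed query -> retrieve -> build prompt -> generate
pipeline = Pipeline()
pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder(model=EMBEDDING_MODEL))
pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store, top_k=2))
pipeline.add_component("prompt_builder", PromptBuilder(template=template))
# OpenAIGenerator reads OPENAI_API_KEY from the environment by default
pipeline.add_component("generator", OpenAIGenerator(model="gpt-4o-mini"))
pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
pipeline.connect("retriever.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "generator.prompt")
# Run query
query = "What is RAGAS?"
result = pipeline.run({"text_embedder": {"text": query}, "prompt_builder": {"question": query}})
print("Generated answer:", result["generator"]["replies"][0])
Output
The exact wording depends on the model; the answer should restate the retrieved document, for example:
Generated answer: Ragas is an open-source framework for evaluating retrieval-augmented generation (RAG) pipelines.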
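To make the retrieval step less opaque, here is a minimal sketch of what an embedding retriever does conceptually: it scores each stored vector against the query vector by cosine similarity and keeps the best matches. The names cosine_similarity and top_k and the three-dimensional toy vectors are illustrative assumptions, not Haystack APIs; a real model such as all-MiniLM-L6-v2 produces 384-dimensional vectors.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 2,
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs for the k most similar vectors."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings standing in for real document vectors
doc_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
query_vec = [0.9, 0.1, 0.0]

ranked = top_k(query_vec, doc_vecs, k=2)
print([index for index, _ in ranked])  # indices of the two closest documents
```

With these toy vectors the first and third documents rank highest, because they point in nearly the same direction as the query. A retriever does the same comparison at scale, usually with an index structure that avoids scoring every vector.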
Common variations
You can customize the integration by:
- Using different retrievers, such as a BM25 retriever (InMemoryBM25Retriever in Haystack v2), or external vector stores.
- Switching the generator to another OpenAI model such as gpt-4o.
- Running the pipeline asynchronously by using async methods in Haystack.
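As a rough sketch of the asynchronous variation, the example below runs several queries concurrently by handing a blocking call to worker threads with asyncio. The function run_pipeline is a hypothetical stand-in for a blocking call such as pipeline.run(...); it does not call Haystack, and Haystack's own async support is not shown here.

```python
import asyncio
from typing import List


def run_pipeline(query: str) -> str:
    """Hypothetical stand-in for a blocking pipeline call (no Haystack involved)."""
    return f"answer to: {query}"


async def run_many(queries: List[str]) -> List[str]:
    """Run the blocking function for each query in the default thread pool."""
    loop = asyncio.get_running_loop()
    tasks = [loop.run_in_executor(None, run_pipeline, q) for q in queries]
    # gather preserves the order of the submitted tasks
    return list(await asyncio.gather(*tasks))


answers = asyncio.run(run_many(["What is RAGAS?", "What is Haystack?"]))
print(answers)
```

Threads suit this case because the time is mostly spent waiting on network calls to the model provider. Whether sharing one pipeline object across threads is safe depends on the components involved, so treat that as something to verify before using this pattern.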
Troubleshooting
- If you get empty answers, ensure your documents are indexed and embeddings are updated.
- Check that your OpenAI API key is correctly set in os.environ["OPENAI_API_KEY"].
- For FAISS errors, verify that faiss-cpu is installed and compatible with your system.
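The environment checks above can be sketched as a small preflight script. The function name preflight and the report labels are assumptions made for illustration; the script only checks that the key is present and that the packages can be found, not that the key is valid or that FAISS works on your hardware.

```python
import importlib.util
import os
from typing import Dict, List, Mapping


def preflight(modules: List[str], env: Mapping[str, str] = os.environ) -> Dict[str, bool]:
    """Report whether the API key is set and each top-level module is installed."""
    report = {"OPENAI_API_KEY set": bool(env.get("OPENAI_API_KEY", "").strip())}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        report[f"module {name}"] = importlib.util.find_spec(name) is not None
    return report


# Import names differ from package names: haystack-ai -> haystack, faiss-cpu -> faiss
for check, ok in preflight(["haystack", "openai", "faiss"]).items():
    print(f"{'OK     ' if ok else 'MISSING'}  {check}")
```

Running this before the pipeline separates setup problems from retrieval problems, which narrows down the cause of an empty or failed answer.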
Key Takeaways
- Use Haystack's Pipeline to combine a retriever and OpenAIGenerator into the RAG pipeline that RAGAS evaluates.
- Always update document embeddings after adding documents for accurate retrieval.
- Set your OpenAI API key in environment variables to authenticate the generator.
- Customize retrievers and generators to fit your use case and model preferences.