How to use Elasticsearch with Haystack

Quick answer
Use Haystack 2.x with an ElasticsearchDocumentStore to index and retrieve documents, then connect a retriever to a generator in a Pipeline for AI-powered search. Install haystack-ai and the elasticsearch-haystack integration package, configure the document store, and run queries through the pipeline.

Prerequisites

  • Python 3.8+
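  • pip install "haystack-ai>=2.0" (quote the requirement so the shell does not treat > as a redirect)
  • Elasticsearch server running (7.x or 8.x; match the major version of the installed elasticsearch Python client)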
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0"

Setup

Install the required Python packages and ensure you have a running Elasticsearch instance. You can run Elasticsearch locally via Docker or use a managed service.

  • Install Haystack, the Elasticsearch integration for Haystack 2.x, and the OpenAI client:
bash
pip install haystack-ai elasticsearch-haystack openai
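
After installing, it can help to confirm which versions actually landed in your environment, since Haystack 1.x and 2.x expose different APIs. The following is a minimal sketch using only the standard library; the `PACKAGES` tuple and the `installed_version` helper are illustrative names, not part of Haystack, and the list should be edited to match what you installed.

```python
from importlib import metadata

# Package names are assumptions; edit the tuple to match what you installed.
PACKAGES = ("haystack-ai", "elasticsearch", "openai")


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in PACKAGES:
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

A line reading "not installed" points to a missing dependency before any pipeline code runs.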

Step by step

This example shows how to create an ElasticsearchDocumentStore, index documents, and run a query using Haystack's Pipeline with OpenAI's GPT model for answer generation.

python
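from haystack import Document, Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.document_stores.types import DuplicatePolicy
from haystack.utils import Secret
from haystack_integrations.components.retrievers.elasticsearch import ElasticsearchBM25Retriever
from haystack_integrations.document_stores.elasticsearch import ElasticsearchDocumentStore

# Configure the Elasticsearch document store (Haystack 2.x integration).
# Extra keyword arguments such as basic_auth=("user", "password") are
# forwarded to the underlying Elasticsearch client.
document_store = ElasticsearchDocumentStore(
    hosts="http://localhost:9200",
    index="haystack-docs",
)

# Sample documents to index
docs = [
    Document(content="Elasticsearch is a distributed, RESTful search engine.", meta={"source": "wiki"}),
    Document(content="Haystack is a framework for building search systems.", meta={"source": "wiki"}),
]

# Write documents to Elasticsearch; OVERWRITE keeps the script re-runnable
document_store.write_documents(docs, policy=DuplicatePolicy.OVERWRITE)

# Prompt template that passes the retrieved documents to the model
template = """Answer the question using only the documents below.
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Question: {{ question }}
Answer:"""

# Initialize retriever, prompt builder, and generator
retriever = ElasticsearchBM25Retriever(document_store=document_store)
prompt_builder = PromptBuilder(template=template)
generator = OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY"), model="gpt-4o-mini")

# Build pipeline: retriever -> prompt builder -> generator
pipeline = Pipeline()
pipeline.add_component("retriever", retriever)
pipeline.add_component("prompt_builder", prompt_builder)
pipeline.add_component("generator", generator)
pipeline.connect("retriever.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "generator.prompt")

# Run query
query = "What is Elasticsearch?"
result = pipeline.run({
    "retriever": {"query": query, "top_k": 3},
    "prompt_builder": {"question": query},
})

print("Answer:", result["generator"]["replies"][0])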
output
Answer: Elasticsearch is a distributed, RESTful search engine.
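
A generator only sees text, so the retrieved documents have to be folded into a prompt before the model is called. The sketch below shows that step in plain Python, without Haystack. The `build_prompt` helper and the prompt wording are hypothetical; in a Haystack 2.x pipeline a prompt builder component would typically do this work.

```python
def build_prompt(question, documents):
    """Fold retrieved document texts and the user question into one prompt string."""
    context = "\n".join(f"- {text}" for text in documents)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Stand-in for the texts a retriever would return for the query
retrieved = [
    "Elasticsearch is a distributed, RESTful search engine.",
    "Haystack is a framework for building search systems.",
]
prompt = build_prompt("What is Elasticsearch?", retrieved)
print(prompt)
```

Printing the prompt this way is a quick check on what the model is actually given when an answer looks wrong.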

Common variations

You can swap in other components: for dense retrieval, use ElasticsearchEmbeddingRetriever together with a text embedder, and for chat-based models use OpenAIChatGenerator with a ChatPromptBuilder. Recent Haystack 2.x releases also provide an AsyncPipeline for asynchronous execution.

python
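from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.retrievers.elasticsearch import ElasticsearchEmbeddingRetriever

# Dense retrieval plus a chat model. Reuses document_store from the previous example and
# assumes its documents were indexed with embeddings from the same model
# (for example via SentenceTransformersDocumentEmbedder; needs the sentence-transformers package).
template = [ChatMessage.from_user(
    "Context:\n{% for doc in documents %}{{ doc.content }}\n{% endfor %}\nQuestion: {{ question }}"
)]

pipeline = Pipeline()
pipeline.add_component("embedder", SentenceTransformersTextEmbedder())
pipeline.add_component("retriever", ElasticsearchEmbeddingRetriever(document_store=document_store))
pipeline.add_component("prompt_builder", ChatPromptBuilder(template=template))
pipeline.add_component("generator", OpenAIChatGenerator(model="gpt-4o"))  # reads OPENAI_API_KEY
pipeline.connect("embedder.embedding", "retriever.query_embedding")
pipeline.connect("retriever.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "generator.messages")

question = "Explain Haystack framework"
result = pipeline.run({"embedder": {"text": question}, "retriever": {"top_k": 5}, "prompt_builder": {"question": question}})
print("Chat answer:", result["generator"]["replies"][0].text)  # older 2.x releases expose .content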
output
Chat answer: Haystack is an open-source framework designed to build powerful search systems that combine traditional search with AI models.

Troubleshooting

  • If you get connection errors, verify your Elasticsearch server is running and accessible at the configured host and port.
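  • Ensure the Elasticsearch server version is compatible with the installed elasticsearch Python client; as a rule, the client's major version should match the server's.
  • If indexing fails, check that each document has a content field, that the Elasticsearch user is allowed to write to the index, and that you are not re-writing documents whose IDs already exist.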
  • For OpenAI errors, confirm your OPENAI_API_KEY environment variable is set correctly.
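
The first and last checks above can be scripted. The following is a minimal sketch using only the standard library; the helper names `elasticsearch_reachable` and `openai_key_present` are hypothetical, and the default URL assumes a local, unsecured Elasticsearch on port 9200.

```python
import http.client
import os
import urllib.error
import urllib.request


def elasticsearch_reachable(url="http://localhost:9200", timeout=3.0):
    """Return True if an HTTP server answers at `url`, even with an auth error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, for example with 401 when security is enabled.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, DNS failure, timeout, or a non-HTTP reply.
        return False


def openai_key_present(env=os.environ):
    """Return True if OPENAI_API_KEY is set to a non-empty value."""
    return bool(env.get("OPENAI_API_KEY", "").strip())


print("Elasticsearch reachable:", elasticsearch_reachable())
print("OPENAI_API_KEY set:", openai_key_present())
```

This only confirms that something answers HTTP at the given address and that the variable is non-empty; it does not validate credentials or the key itself.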

Key Takeaways

  • Use ElasticsearchDocumentStore in Haystack to leverage Elasticsearch for document indexing and retrieval.
  • Combine retrievers like ElasticsearchBM25Retriever with generators such as OpenAIGenerator for AI-powered search answers.
  • Ensure Elasticsearch is running and accessible before indexing or querying documents.
  • Haystack supports multiple retriever and generator variations for flexible search pipelines.
  • Set the OPENAI_API_KEY environment variable before running the pipeline to enable generation.
Verified 2026-04 · gpt-4o-mini, gpt-4o