How to use Weaviate with Haystack

Quick answer

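Use WeaviateDocumentStore from haystack.document_stores (Haystack 1.x, installed as farm-haystack) to connect Haystack to a Weaviate instance. Write your documents to the store, compute their embeddings with an EmbeddingRetriever, and then query the retriever to run semantic search over Weaviate's vector index. The retrieved documents can be passed on to Haystack readers or generators.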

Prerequisites

Python 3.8+
Weaviate instance running (local or cloud)
pip install "farm-haystack[weaviate]" (the code below uses the Haystack 1.x API, not haystack-ai)
OpenAI API key or other embedding model API key

Setup

Install the necessary Python packages and ensure you have a running Weaviate instance. You also need an embedding model API key (e.g., OpenAI) for vectorizing documents.

bash

pip install "farm-haystack[weaviate]" weaviate-client openai
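
Before indexing, it can help to confirm that the Weaviate instance is actually reachable. The sketch below queries Weaviate's REST readiness endpoint (/v1/.well-known/ready) using only the standard library. The helper names ready_url and is_ready are illustrative and are not part of Haystack or the Weaviate client; the default host and port assume a local instance.

```python
import urllib.error
import urllib.request


def ready_url(host: str = "http://localhost", port: int = 8080) -> str:
    """Build the URL of Weaviate's readiness probe."""
    return f"{host.rstrip('/')}:{port}/v1/.well-known/ready"


def is_ready(host: str = "http://localhost", port: int = 8080, timeout: float = 3.0) -> bool:
    """Return True if the instance answers the readiness probe with HTTP 200."""
    try:
        with urllib.request.urlopen(ready_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or a non-2xx response
        return False
```

Calling is_ready() before creating the document store turns a confusing connection error into a simple True/False check.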

Step by step

This example demonstrates connecting Haystack to Weaviate, indexing documents, and performing a semantic search query.

python

import os
from haystack.document_stores import WeaviateDocumentStore
from haystack.nodes import EmbeddingRetriever

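# Configure the Weaviate connection (change host and port for a remote instance)
weaviate_host = "http://localhost"
weaviate_port = 8080

# Initialize WeaviateDocumentStore (it takes host and port, not a single url)
document_store = WeaviateDocumentStore(
    host=weaviate_host,
    port=weaviate_port,
    index="HaystackDocs",  # Weaviate class names start with a capital letter and cannot contain hyphens
    embedding_dim=1536,  # output dimension of text-embedding-3-small
    similarity="cosine",
)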

# Initialize retriever with OpenAI embeddings
retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="text-embedding-3-small",
    api_key=os.environ["OPENAI_API_KEY"]
)

# Sample documents to index
docs = [
    {"content": "Haystack is an open source NLP framework.", "meta": {"source": "wiki"}},
    {"content": "Weaviate is a vector search engine.", "meta": {"source": "wiki"}}
]

# Write documents to Weaviate
document_store.write_documents(docs)

# Update embeddings in Weaviate
document_store.update_embeddings(retriever)

# Perform a semantic search
query = "What is Haystack?"
retrieved_docs = retriever.retrieve(query)

print("Top document:", retrieved_docs[0].content)

output

Top document: Haystack is an open source NLP framework.
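
To make the ranking step concrete: with similarity="cosine", documents are ordered by the cosine of the angle between the query vector and each document vector. The sketch below reproduces that ordering in plain Python on made-up 3-dimensional vectors (real embeddings from text-embedding-3-small have 1536 dimensions). The vectors and the cosine and rank helpers are illustrative only and are not Haystack or Weaviate APIs.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs):
    """Return document ids sorted from most to least similar to the query."""
    scores = {doc_id: cosine(query_vec, vec) for doc_id, vec in doc_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy vectors standing in for real embeddings (assumed values)
doc_vecs = {
    "haystack": [0.9, 0.1, 0.0],
    "weaviate": [0.1, 0.9, 0.2],
}
query_vec = [0.8, 0.2, 0.1]

print(rank(query_vec, doc_vecs))
```

The query vector points in nearly the same direction as the "haystack" vector, so that document is ranked first, mirroring the output of the retriever above.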

Common variations

Use different embedding models by changing embedding_model in EmbeddingRetriever.
Use FARMReader or other readers for extractive QA pipelines.
Connect to a cloud-hosted Weaviate by changing the host and port and supplying the credentials your instance requires.
Use async versions of Haystack components if needed.
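
When swapping embedding models, embedding_dim has to change with them. A small lookup like the one below can keep the two in sync. The dimensions listed are the published output sizes of these models; the EMBEDDING_DIMS table and the dim_for helper are hypothetical conveniences, not Haystack APIs.

```python
# Published output dimensions for a few common embedding models
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "sentence-transformers/all-mpnet-base-v2": 768,
}


def dim_for(model_name: str) -> int:
    """Look up the vector size for a model, failing loudly if it is unknown."""
    try:
        return EMBEDDING_DIMS[model_name]
    except KeyError:
        raise ValueError(
            f"Unknown embedding model {model_name!r}; add its dimension to EMBEDDING_DIMS"
        ) from None


embedding_model = "text-embedding-3-small"
embedding_dim = dim_for(embedding_model)
print(embedding_model, embedding_dim)
```

Both values can then be passed on: embedding_dim to WeaviateDocumentStore and embedding_model to EmbeddingRetriever.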

Troubleshooting

If documents do not appear in Weaviate, verify the host and port passed to WeaviateDocumentStore and check network connectivity.
Ensure embedding_dim matches the output size of the embedding model (1536 for text-embedding-3-small).
If update_embeddings fails, check your API key and internet connection.
For schema conflicts, delete the existing Weaviate index or use a new index name.
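
Some schema errors come from the index name itself: Weaviate class names must start with an uppercase letter and may contain only letters, digits, and underscores, so a hyphenated name such as my-index is not valid. The sketch below converts an arbitrary label into a valid class name before it is passed as index. The to_class_name helper and the "Index" prefix it adds to names starting with a digit are assumptions made for illustration, not part of Haystack or Weaviate.

```python
import re

# Pattern Weaviate applies to class names
VALID_CLASS_NAME = re.compile(r"^[A-Z][_0-9A-Za-z]*$")


def to_class_name(raw: str) -> str:
    """Convert an arbitrary label into a CamelCase name that matches VALID_CLASS_NAME."""
    # Split on anything that is not an ASCII letter or digit and drop empty pieces
    parts = [p for p in re.split(r"[^0-9A-Za-z]+", raw) if p]
    if not parts:
        raise ValueError("index name must contain at least one letter or digit")
    name = "".join(p[0].upper() + p[1:] for p in parts)
    if not name[0].isalpha():
        name = "Index" + name  # class names cannot start with a digit
    return name


print(to_class_name("my-index"))
```

Picking a fresh, valid name this way is also a quick way to sidestep a conflict with an existing class.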

Key Takeaways

Use WeaviateDocumentStore to integrate Weaviate with Haystack for vector search.
Write documents and run update_embeddings before querying; semantic search is only meaningful once every document has a real embedding.
Adjust embedding model and Weaviate connection settings for your environment.

Verified 2026-04 · text-embedding-3-small
