
How to filter documents by metadata in LlamaIndex

Quick answer
In LlamaIndex, filter documents by metadata by passing a MetadataFilters object through the filters argument of index.as_retriever() or index.as_query_engine(). Each filter, such as ExactMatchFilter, names a metadata key and the value it must equal, so retrieval only considers documents whose metadata satisfies the filters.

Prerequisites

  • Python 3.8+
  • pip install "llama-index>=0.10.0"
  • Basic knowledge of Python and LlamaIndex

Setup

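Install llama-index with pip, quoting the version specifier so the shell does not interpret >= as an output redirect. The default embedding model and LLM call the OpenAI API, so set the OPENAI_API_KEY environment variable before running the examples.

bash
pip install "llama-index>=0.10.0"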
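
To confirm which release actually ended up in your environment, a small check along these lines can help. This is a minimal sketch: version_tuple, meets_minimum, and installed_version are hypothetical helper names, and the parser only handles plain dotted numeric versions, stopping at the first non-numeric part such as a "post1" suffix.

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '0.10.5' into (0, 10, 5); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two dotted versions numerically rather than as strings."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so '0.10' compares equal to '0.10.0'
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


def installed_version(package: str = "llama-index"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

Call installed_version() and pass the result, together with the minimum version listed in the prerequisites, to meets_minimum. Comparing numerically matters because a plain string comparison would rank "0.10.5" below "0.6.0".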

Step by step

This example shows how to create documents with metadata, build an index, and filter documents by metadata during querying.

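python
from llama_index.core import Document, VectorStoreIndex
from llama_index.core.vector_stores import ExactMatchFilter, MetadataFilters

# Create documents with metadata
documents = [
    Document(text="Document about cats.", metadata={"category": "animals", "type": "pet"}),
    Document(text="Document about dogs.", metadata={"category": "animals", "type": "pet"}),
    Document(text="Document about cars.", metadata={"category": "vehicles", "type": "transport"}),
]

# Build the index (embeds each document with the configured embedding model)
index = VectorStoreIndex.from_documents(documents)

# Keep only documents whose metadata has category == "animals"
filters = MetadataFilters(
    filters=[ExactMatchFilter(key="category", value="animals")]
)

# Query through a query engine restricted by the metadata filter
query_engine = index.as_query_engine(filters=filters)
response = query_engine.query("Tell me about pets.")

# Print the source documents that passed the filter (ranked by similarity,
# so the order may vary). The LLM-written answer is in response.response.
for source in response.source_nodes:
    print(source.node.get_content())
output
Document about cats.
Document about dogs.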
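
To make the matching rule concrete, the sketch below mimics exact-match filtering in plain Python, with no LlamaIndex dependency. It assumes every key in the filter must be present with an equal value, so multiple conditions combine with AND. The helpers matches and filter_docs are hypothetical names written for illustration, not LlamaIndex functions.

```python
from typing import Any, Dict, List

# Plain dictionaries standing in for the three documents in the example
docs = [
    {"text": "Document about cats.", "metadata": {"category": "animals", "type": "pet"}},
    {"text": "Document about dogs.", "metadata": {"category": "animals", "type": "pet"}},
    {"text": "Document about cars.", "metadata": {"category": "vehicles", "type": "transport"}},
]


def matches(metadata: Dict[str, Any], required: Dict[str, Any]) -> bool:
    """True only if every required key exists and has exactly the required value."""
    return all(key in metadata and metadata[key] == value
               for key, value in required.items())


def filter_docs(items: List[Dict[str, Any]], required: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Keep the items whose metadata satisfies all required key-value pairs."""
    return [item for item in items if matches(item["metadata"], required)]


animal_texts = [d["text"] for d in filter_docs(docs, {"category": "animals"})]
print(animal_texts)  # ['Document about cats.', 'Document about dogs.']

# Two conditions must both hold, so a contradictory pair matches nothing
mixed = filter_docs(docs, {"category": "animals", "type": "transport"})
print(mixed)  # []
```

The point of the sketch is that filtering happens on metadata alone: the query text plays no part in deciding which documents are eligible.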

Common variations

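The same filters argument works on retrievers, which return the matching nodes without generating an answer. For example, with as_retriever():

python
from llama_index.core.vector_stores import ExactMatchFilter, MetadataFilters

# Reuses the index built in the previous example
retriever = index.as_retriever(
    filters=MetadataFilters(filters=[ExactMatchFilter(key="type", value="pet")])
)
results = retriever.retrieve("Tell me about pets.")
for result in results:
    # Each result is a NodeWithScore; the text lives on the wrapped node
    print(result.node.get_content())
output
Document about cats.
Document about dogs.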

Troubleshooting

  • If no documents are returned, verify your metadata keys and values exactly match those in your documents.
  • Ensure metadata is provided as a dictionary when creating Document instances.
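  • Check your installed llama-index version (pip show llama-index). MetadataFilters and ExactMatchFilter are importable from llama_index.core.vector_stores in version 0.10.0 and later; older releases expose them from different modules.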
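
When a filter unexpectedly returns nothing, it can help to list which metadata values actually exist and compare them with the filter. The helper below is a minimal sketch in plain Python over metadata dictionaries. explain_filter is a hypothetical name, not part of LlamaIndex, and it assumes exact, case-sensitive equality, so "Animals" and "animals" differ, as do "1" and 1.

```python
from typing import Any, Dict, List


def explain_filter(metadatas: List[Dict[str, Any]], required: Dict[str, Any]) -> List[str]:
    """Return one message per filter key that cannot match any document."""
    problems = []
    for key, wanted in required.items():
        # Collect every value stored under this key across all documents
        seen = [m[key] for m in metadatas if key in m]
        if not seen:
            problems.append(f"key {key!r} is not present in any document")
        elif wanted not in seen:
            found = sorted({repr(value) for value in seen})
            problems.append(f"key {key!r} never equals {wanted!r}; found: {', '.join(found)}")
    return problems


metadatas = [
    {"category": "animals", "type": "pet"},
    {"category": "vehicles", "type": "transport"},
]

# Both conditions are misspelled by letter case only
for line in explain_filter(metadatas, {"Category": "animals", "type": "Pet"}):
    print(line)
# key 'Category' is not present in any document
# key 'type' never equals 'Pet'; found: 'pet', 'transport'
```

An empty list means each condition can match on its own. It does not guarantee that a single document satisfies all conditions at once.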

Key takeaways

  • Pass a MetadataFilters object through the filters argument to restrict queries to documents matching specific metadata.
  • Metadata must be set as a dictionary when creating Document objects for filtering to work.
  • Filtering by metadata works through both query engines (as_query_engine) and retrievers (as_retriever).
  • Verify metadata keys and values carefully to avoid empty query results.
Verified 2026-04