How to filter results in Chroma query
Quick answer
To filter results in a Chroma query, use the where parameter to specify metadata conditions that documents must meet. This lets you restrict search results by tags, categories, or other stored metadata fields during vector similarity search.
Prerequisites
- Python 3.8+
- pip install chromadb
- Basic knowledge of vector search and metadata tagging
Setup
Install the chromadb Python client and set up a collection with documents and metadata for filtering.
pip install chromadb
Step by step
This example shows how to add documents with metadata to a Chroma collection and query with a filter using the where parameter.
import chromadb

# Initialize an in-memory client and create a collection
client = chromadb.Client()
collection = client.create_collection(name="example_collection")

# Add documents with metadata
collection.add(
    documents=["Document about AI", "Document about cooking", "Document about AI ethics"],
    metadatas=[
        {"category": "technology"},
        {"category": "cooking"},
        {"category": "technology"},
    ],
    ids=["doc1", "doc2", "doc3"],
)

# Query with a filter so only the technology category is searched
results = collection.query(
    query_texts=["AI"],
    n_results=2,
    where={"category": "technology"},
)
print(results)
Output (illustrative; exact distances depend on the embedding model)
{'ids': [['doc1', 'doc3']], 'distances': [[0.12, 0.15]], 'documents': [['Document about AI', 'Document about AI ethics']], 'metadatas': [[{'category': 'technology'}, {'category': 'technology'}]]}
Common variations
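To filter on multiple metadata fields, combine the conditions with a logical operator such as $and, for example {"$and": [{"category": "technology"}, {"author": "Alice"}]}. Recent chromadb versions expect exactly one top-level key in where, so a plain dictionary with several keys such as {"category": "technology", "author": "Alice"} is rejected with a validation error. Async querying and other SDKs follow similar patterns; check their documentation for the exact syntax.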
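As a minimal sketch of combining several conditions, the helper below builds a where dictionary from keyword arguments. The name build_where is hypothetical (it is not part of chromadb), and it assumes the $and operator form that recent chromadb releases expect when more than one field is filtered. The commented-out query reuses the collection from the example above; the "author" field is only an illustration and matches nothing unless you stored it.

```python
from typing import Any, Dict, Optional


def build_where(**conditions: Any) -> Optional[Dict[str, Any]]:
    """Build a metadata filter for the `where` parameter (hypothetical helper).

    One condition stays a plain {key: value} dictionary; several conditions
    are wrapped in {"$and": [...]} (assumed operator form, see lead-in).
    """
    clauses = [{key: value} for key, value in conditions.items()]
    if not clauses:
        return None  # no filter at all, same as omitting `where`
    if len(clauses) == 1:
        return clauses[0]
    return {"$and": clauses}


single = build_where(category="technology")
combined = build_where(category="technology", author="Alice")
print(single)
print(combined)

# Usage with the collection from the example above (not executed here):
# results = collection.query(query_texts=["AI"], n_results=2, where=combined)
```

Keeping the filter construction in one place makes it easier to adjust if the SDK you use expects a different operator syntax.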
Troubleshooting
If your filtered query returns no results, verify that the metadata keys and values exactly match what you stored. Metadata filtering is case-sensitive and requires exact matches. Also, ensure documents have metadata assigned.
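One way to check for such mismatches is to list the metadata values that are actually stored. The sketch below assumes the records come from collection.get(include=["metadatas"]), whose "metadatas" entry holds one item per document. summarize_metadata is a hypothetical helper rather than a chromadb function, and the stored list is stand-in data so the snippet runs on its own.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Set


def summarize_metadata(
    metadatas: List[Optional[Dict[str, Any]]],
) -> Dict[str, Set[Any]]:
    """Collect the distinct values stored under each metadata key."""
    summary: Dict[str, Set[Any]] = defaultdict(set)
    for metadata in metadatas:
        if not metadata:
            # Documents added without metadata may come back as None
            continue
        for key, value in metadata.items():
            summary[key].add(value)
    return dict(summary)


# Stand-in for: stored = collection.get(include=["metadatas"])["metadatas"]
stored = [{"category": "technology"}, {"category": "Cooking"}, None]
summary = summarize_metadata(stored)
print(summary)
```

In this stand-in data, "Cooking" would not match where={"category": "cooking"} because the comparison is case-sensitive, and the third document could never match a metadata filter because it has no metadata.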
Key Takeaways
- Use the where parameter in Chroma queries to filter by metadata fields.
- Metadata filters require exact key-value matches and are case-sensitive.
- Filtering improves retrieval relevance by restricting results to specific categories or tags.