How to filter with Pinecone metadata
Quick answer
Use the filter parameter of the index.query() method to restrict Pinecone vector search results by metadata. The filter is a dictionary of conditions on metadata fields, so only vectors whose metadata satisfies those conditions are returned.
Prerequisites
- Python 3.8+
- Pinecone API key
- pip install "pinecone-client>=3.0.0"
Setup
Install the official Pinecone Python client and set your API key as an environment variable.
pip install pinecone-client
output
Collecting pinecone-client
  Downloading pinecone_client-3.x.x-py3-none-any.whl (xx kB)
Installing collected packages: pinecone-client
Successfully installed pinecone-client-3.x.x
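Since the examples read the key from the PINECONE_API_KEY environment variable, it can help to fail early with a readable message when the variable is missing. The following is a minimal sketch; get_pinecone_api_key is a hypothetical helper name, not part of the Pinecone client.

```python
import os


def get_pinecone_api_key(env=None):
    """Return the API key from PINECONE_API_KEY, or raise a clear error.

    Hypothetical helper for this guide; `env` defaults to os.environ and
    exists only so the function can be exercised without touching the
    real environment.
    """
    env = os.environ if env is None else env
    api_key = env.get("PINECONE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "PINECONE_API_KEY is not set; export it before running the examples."
        )
    return api_key
```

The returned value can then be passed as api_key when constructing the Pinecone client.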
Step by step
This example demonstrates how to initialize Pinecone, create an index, upsert vectors with metadata, and query with a metadata filter.
import os
from pinecone import Pinecone, ServerlessSpec

# Initialize the Pinecone client
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Connect to an existing index or create one
index_name = "example-index"
if index_name not in pc.list_indexes().names():
    # pinecone-client 3.x requires a spec; the cloud and region here are
    # assumptions, so adjust them to match your project.
    pc.create_index(
        name=index_name,
        dimension=4,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
index = pc.Index(index_name)

# Upsert vectors with metadata as (id, values, metadata) tuples
vectors = [
    ("vec1", [0.1, 0.2, 0.3, 0.4], {"category": "books", "author": "Alice"}),
    ("vec2", [0.2, 0.1, 0.4, 0.3], {"category": "books", "author": "Bob"}),
    ("vec3", [0.9, 0.8, 0.7, 0.6], {"category": "movies", "director": "Alice"}),
]
index.upsert(vectors=vectors)

# Query with a metadata filter to find vectors in category 'books'
query_vector = [0.1, 0.2, 0.3, 0.4]
metadata_filter = {"category": {"$eq": "books"}}  # avoids shadowing built-in filter()
response = index.query(
    vector=query_vector,
    top_k=2,
    filter=metadata_filter,
    include_metadata=True,
)
print("Filtered query results:")
for match in response.matches:
    print(f"ID: {match.id}, Score: {match.score}, Metadata: {match.metadata}")
output
Filtered query results:
ID: vec1, Score: 0.987654, Metadata: {'category': 'books', 'author': 'Alice'}
ID: vec2, Score: 0.876543, Metadata: {'category': 'books', 'author': 'Bob'}
Common variations
- Combine conditions on multiple metadata fields with $and and $or, and use comparison operators such as $gt and $lt.
- Filter on numeric metadata fields for range queries.
- Use an async Pinecone client for asynchronous querying.
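The numeric range case from the list above can be sketched as a small filter builder. This is an illustration only: range_filter is a hypothetical helper, and "year" is an assumed numeric metadata field that does not appear in the example data. It uses the inclusive operators $gte and $lte; swap in $gt and $lt for exclusive bounds.

```python
def range_filter(field, low=None, high=None):
    """Build a metadata filter matching low <= field <= high.

    Hypothetical helper for this guide; either bound may be omitted.
    """
    conditions = []
    if low is not None:
        conditions.append({field: {"$gte": low}})
    if high is not None:
        conditions.append({field: {"$lte": high}})
    if not conditions:
        raise ValueError("Provide at least one of low or high")
    # A single bound needs no $and wrapper.
    if len(conditions) == 1:
        return conditions[0]
    return {"$and": conditions}


# "year" is an illustrative numeric field, not part of the example data.
year_filter = range_filter("year", low=2000, high=2010)
print(year_filter)
```

The resulting dictionary is passed as filter= to index.query() in the same way as any other filter.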
filter_complex = {
    "$and": [
        {"category": {"$eq": "books"}},
        {"author": {"$in": ["Alice", "Carol"]}},
    ]
}
response = index.query(
    vector=query_vector,
    top_k=3,
    filter=filter_complex,
    include_metadata=True,
)
for match in response.matches:
    print(match.id, match.metadata)
output
vec1 {'category': 'books', 'author': 'Alice'}
Troubleshooting
- If you get Invalid filter errors, verify that your filter dictionary uses supported operators such as $eq, $in, and $and, and that the metadata keys you filter on exist on your vectors.
- Make sure vectors are upserted with metadata included; a field that was never upserted cannot be filtered on.
- Check that your Pinecone environment and API key are set correctly.
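One way to catch malformed filters before they reach the API is a local pre-check. The sketch below is an assumption-laden illustration, not Pinecone's own validation: find_filter_problems is a hypothetical helper, and the operator sets reflect the operators commonly documented for Pinecone metadata filters, so confirm them against the documentation for your client version.

```python
# Operator sets assumed from Pinecone's documented filter language;
# verify against the docs for the version you use.
LOGICAL_OPERATORS = {"$and", "$or"}
FIELD_OPERATORS = {"$eq", "$ne", "$gt", "$gte", "$lt", "$lte", "$in", "$nin", "$exists"}


def find_filter_problems(flt, path="filter"):
    """Return a list of human-readable problems found in a filter dict."""
    if not isinstance(flt, dict):
        return [f"{path}: expected a dict, got {type(flt).__name__}"]
    problems = []
    for key, value in flt.items():
        here = f"{path}.{key}"
        if not isinstance(key, str):
            problems.append(f"{here}: keys must be strings")
        elif key in LOGICAL_OPERATORS:
            # $and / $or take a non-empty list of nested filter dicts.
            if not isinstance(value, list) or not value:
                problems.append(f"{here}: expected a non-empty list of conditions")
            else:
                for i, sub in enumerate(value):
                    problems.extend(find_filter_problems(sub, f"{here}[{i}]"))
        elif key.startswith("$"):
            problems.append(f"{here}: operator not allowed at this level")
        elif isinstance(value, dict):
            # A field maps to {operator: operand} pairs.
            for op, operand in value.items():
                if op not in FIELD_OPERATORS:
                    problems.append(f"{here}.{op}: unknown operator")
                elif op in {"$in", "$nin"} and not isinstance(operand, list):
                    problems.append(f"{here}.{op}: expected a list of values")
        # A bare value such as {"category": "books"} is treated as shorthand for $eq.
    return problems
```

An empty list means the check found nothing to report; for example, a misspelled operator such as "$equals" is reported with the path to the offending key. The check looks only at structure and cannot tell whether a metadata key actually exists on your vectors.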
Key Takeaways
- Use the filter parameter in index.query() to restrict results by metadata.
- Filters support logical operators like $and and $or, and comparison operators like $eq.
- Always upsert vectors with metadata to enable filtering on those fields.
- Test filters with simple queries before using complex nested conditions.
- Ensure your Pinecone API key and environment are correctly configured.