How to query ChromaDB
Quick answer
To query ChromaDB, use the official Python client to get a collection, then call its query method with your query vectors and set n_results to the number of nearest neighbors you want. By default the call returns the ids of the closest matches together with their distances and metadata.
Prerequisites
- Python 3.8+
- pip install chromadb
- Basic knowledge of vector embeddings
- ChromaDB server or local instance running
Setup
Install the chromadb Python package in an environment with Python 3.8 or higher. The example in this guide uses chromadb.Client(), which runs ChromaDB in-process, so a separate server only needs to be running if you connect to a remote instance.
pip install chromadb
Step by step
This example shows how to create a collection, insert vectors with metadata, and query the collection for nearest neighbors.
import chromadb
# Initialize an in-memory client; data is lost when the process exits.
# Use chromadb.PersistentClient(path=...) to keep data on disk.
client = chromadb.Client()
# Create or get a collection
collection = client.get_or_create_collection(name="example_collection")
# Insert sample vectors with ids and metadata
collection.add(
    ids=["vec1", "vec2", "vec3"],
    embeddings=[
        [0.1, 0.2, 0.3],
        [0.4, 0.5, 0.6],
        [0.7, 0.8, 0.9],
    ],
    metadatas=[
        {"text": "first vector"},
        {"text": "second vector"},
        {"text": "third vector"},
    ],
)
# Query with a vector to find top 2 nearest neighbors
results = collection.query(
    query_embeddings=[[0.1, 0.2, 0.25]],
    n_results=2,
)
# Each result field is a list of lists: one inner list per query vector
print("IDs:", results["ids"])
print("Distances:", results["distances"])
print("Metadatas:", results["metadatas"])
Output
IDs: [['vec1', 'vec2']]
Distances: [[0.0025, 0.3025]]  (squared L2, the default metric; trailing digits vary with float32 rounding)
Metadatas: [[{'text': 'first vector'}, {'text': 'second vector'}]]
Common variations
- Use client = chromadb.Client(chromadb.config.Settings(...)) to customize connection settings.
- Query multiple vectors at once by passing a list of query embeddings.
- Filter queries by metadata using the where parameter.
- Use async patterns with asyncio if integrating into async apps (requires custom wrappers).
Troubleshooting
- If you get connection errors, verify your ChromaDB server is running and accessible.
- Ensure your query vectors have the same dimensionality as inserted vectors.
- Check that the chromadb package is up to date to avoid API mismatches.
- For large datasets, consider indexing strategies or batch queries to optimize performance.
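The dimensionality and batching advice above can be enforced before any request reaches ChromaDB. The helpers below are a hypothetical sketch in plain Python: check_dimensions and chunked are names invented here, not part of the chromadb package, and the batch size of 2 is arbitrary.

```python
from typing import Iterator, List, Sequence


def check_dimensions(query_embeddings: Sequence[Sequence[float]], expected_dim: int) -> None:
    """Raise ValueError if any query vector does not have expected_dim components."""
    for i, vec in enumerate(query_embeddings):
        if len(vec) != expected_dim:
            raise ValueError(
                f"query vector {i} has {len(vec)} dimensions, expected {expected_dim}"
            )


def chunked(items: Sequence[Sequence[float]], size: int) -> Iterator[List[Sequence[float]]]:
    """Yield consecutive batches of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


queries = [[0.1, 0.2, 0.25], [0.4, 0.5, 0.55], [0.7, 0.8, 0.85]]

# Fail fast on a mismatch instead of waiting for the query call to reject it.
check_dimensions(queries, expected_dim=3)

# Each batch could then be passed as query_embeddings to collection.query(...).
batches = list(chunked(queries, 2))
print(len(batches))
```

Checking up front gives an error message that names the offending vector, which is easier to act on than a failure raised deep inside the query call.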
Key Takeaways
- Use the official chromadb Python client to query vector collections easily.
- Always match query vector dimensions with stored embeddings to avoid errors.
- Leverage metadata filtering to refine search results in ChromaDB.
- Keep your chromadb package updated for latest features and fixes.