How to query a Chroma vector store in Python
Quick answer
To query a Chroma vector store in Python, connect with the chromadb client, then call query() on your collection with a query embedding and n_results. The call returns the IDs, distances, documents, and metadata of the most similar stored vectors, which you can feed into retrieval-augmented generation tasks.
Prerequisites
- Python 3.8+
- pip install chromadb
- A pre-populated Chroma collection with embeddings
- Basic knowledge of vector embeddings
Setup
Install the chromadb Python package and import it. Ensure you have a Chroma collection with stored embeddings to query against.
pip install chromadb
Step by step
This example demonstrates connecting to a local Chroma client, accessing a collection, and querying it with an embedding vector to retrieve the top 3 most similar documents.
import chromadb
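# Initialize a local in-memory Chroma client. It starts empty, so the collection
# must already have been created and populated in this process. For data saved
# on disk, use chromadb.PersistentClient(path=...) instead.
client = chromadb.Client()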
# Access your collection by name
collection = client.get_collection(name="my_collection")
# Example query embedding (replace with your actual embedding vector)
query_embedding = [0.1, 0.2, 0.3, 0.4, 0.5]
# Query the collection for top 3 similar vectors
results = collection.query(
    query_embeddings=[query_embedding],
    n_results=3,
)
print("Query results:", results)
Output (illustrative values; the exact set of keys varies by chromadb version)
Query results: {'ids': [['doc1', 'doc2', 'doc3']], 'distances': [[0.12, 0.15, 0.20]], 'metadatas': [[{'source': 'file1.txt'}, {'source': 'file2.txt'}, {'source': 'file3.txt'}]], 'documents': [['Text of doc1', 'Text of doc2', 'Text of doc3']]}
Common variations
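One variation worth sketching first is how the result is consumed. Every field in the result is a list with one inner list per query embedding, so a single query still needs an index of 0. The snippet below is a minimal sketch that reuses the example output shown above; flatten_hits is a hypothetical helper, not part of chromadb.

```python
# Result shape copied from the example output: each field holds one inner
# list per query embedding, and the inner lists are aligned by position.
results = {
    "ids": [["doc1", "doc2", "doc3"]],
    "distances": [[0.12, 0.15, 0.20]],
    "metadatas": [[{"source": "file1.txt"}, {"source": "file2.txt"}, {"source": "file3.txt"}]],
    "documents": [["Text of doc1", "Text of doc2", "Text of doc3"]],
}


def flatten_hits(results, query_index=0):
    """Return one dict per hit for the query at query_index (hypothetical helper)."""
    fields = ("ids", "distances", "metadatas", "documents")
    # Pick the inner list that belongs to the requested query for each field.
    columns = [results[field][query_index] for field in fields]
    return [
        {"id": doc_id, "distance": dist, "metadata": meta, "document": doc}
        for doc_id, dist, meta, doc in zip(*columns)
    ]


hits = flatten_hits(results)
for hit in hits:
    print(f"{hit['id']}  distance={hit['distance']:.2f}  source={hit['metadata']['source']}")
```

Working with one record per hit tends to be easier when building a prompt for retrieval-augmented generation than indexing four parallel lists.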
- Use async querying with asyncio if supported by your Chroma client.
- Query multiple embeddings at once by passing a list of embedding vectors.
- Adjust n_results to control how many nearest neighbors you retrieve.
- Use different distance metrics by configuring the collection at creation.
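To make the distance-metric variation concrete, the sketch below computes the three distances usually documented for Chroma's index spaces ("l2", "ip", "cosine") in plain Python. The formulas are the conventional definitions and are an assumption here, not Chroma's own implementation; how the metric is selected at creation (often through collection metadata such as "hnsw:space") depends on your chromadb version, so check its documentation.

```python
import math


def l2_squared(a, b):
    """Squared Euclidean distance (assumed meaning of the "l2" space)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inner_product_distance(a, b):
    """One minus the dot product (assumed meaning of the "ip" space)."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """One minus the cosine similarity (assumed meaning of the "cosine" space)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [1.0, 0.0]
stored = {"same_direction": [2.0, 0.0], "orthogonal": [0.0, 1.0]}

for name, vector in stored.items():
    print(
        name,
        l2_squared(query, vector),
        inner_product_distance(query, vector),
        cosine_distance(query, vector),
    )
```

In all three cases a smaller value means a closer match, but only the cosine distance ignores vector length, which is why the choice of metric matters when embeddings are not normalized.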
Troubleshooting
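- If you get a collection-not-found error (the exact exception type varies by chromadb version), verify your collection name and that the collection has been created and populated.
- If results are empty or the query raises a dimension error, check that the collection contains data and that your query embedding matches the vector dimension of the stored embeddings.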
- Ensure your Chroma client is running if using a server-based setup.
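The dimension check from the list above can be done before the query is sent. This is a minimal sketch: check_dimension and EXPECTED_DIM are hypothetical names, and the value 5 is an assumption that matches the five-value example vector used earlier.

```python
def check_dimension(query_embedding, expected_dim):
    """Return the embedding unchanged, or raise ValueError on a length mismatch."""
    actual = len(query_embedding)
    if actual != expected_dim:
        raise ValueError(
            f"Query embedding has {actual} dimensions, "
            f"but the collection stores {expected_dim}-dimensional vectors."
        )
    return query_embedding


EXPECTED_DIM = 5  # assumption: the dimension of the vectors stored in your collection

# A vector of the right length passes through untouched.
query_embedding = check_dimension([0.1, 0.2, 0.3, 0.4, 0.5], EXPECTED_DIM)

# A vector of the wrong length is rejected before it reaches the collection.
try:
    check_dimension([0.1, 0.2, 0.3], EXPECTED_DIM)
except ValueError as err:
    print("Rejected:", err)
```

If you do not know the stored dimension, one option is to read a single stored vector, for example with collection.get(limit=1, include=["embeddings"]), and take its length.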
Key Takeaways
- Use chromadb.Client to connect and get_collection to access your vector store.
- Call collection.query() with your query embedding and specify n_results to retrieve similar vectors.
- Ensure your query embedding dimension matches the stored vectors to get meaningful results.