
How to use Chroma with LlamaIndex

Quick answer
Use Chroma as the vector store backend for LlamaIndex: install chromadb and the llama-index-vector-stores-chroma integration, wrap a Chroma collection in ChromaVectorStore, and pass it to your index through a StorageContext. This enables efficient similarity search and retrieval over documents indexed with LlamaIndex.
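To see what the vector store is doing under the hood, here is a minimal pure-Python sketch of similarity-based retrieval. The tiny 3-dimensional "embeddings" and document names are made up for illustration; in practice Chroma stores real embedding vectors produced by a model and searches them efficiently.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "embeddings" for three documents and a query
store = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.2, 0.1],
    "doc_tax":  [0.0, 0.1, 0.9],
}
query = [0.85, 0.15, 0.05]

# Rank documents by similarity to the query, highest first
ranked = sorted(store, key=lambda k: cosine_similarity(store[k], query), reverse=True)
print("Best match:", ranked[0])
```

A real vector store replaces the linear scan with an approximate-nearest-neighbor index, but the retrieval idea is the same.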

PREREQUISITES

  • Python 3.8+
  • pip install llama-index llama-index-vector-stores-chroma chromadb
  • OpenAI API key
  • Set environment variable OPENAI_API_KEY

Setup

Install llama-index, the Chroma integration package llama-index-vector-stores-chroma, and chromadb using pip. Ensure OPENAI_API_KEY is set in your environment so LlamaIndex can call OpenAI models.

bash
pip install llama-index llama-index-vector-stores-chroma chromadb
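Because a missing API key surfaces as a confusing error deep inside the first OpenAI call, it can help to fail fast at startup. A small stdlib-only sketch (the helper name require_env is illustrative, not part of LlamaIndex):

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Set {name} before running this example")
    return value

# In the real setup you would call: require_env("OPENAI_API_KEY")
```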

Step by step

This example shows how to create a LlamaIndex index using Chroma as the vector store backend, add documents, and query the index.

python
import os

import chromadb
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores.chroma import ChromaVectorStore

# LlamaIndex reads the OpenAI key from the environment; fail fast if it is missing
assert os.environ.get("OPENAI_API_KEY"), "Set OPENAI_API_KEY before running"

# Initialize an in-memory Chroma client and wrap a collection as the vector store
chroma_client = chromadb.EphemeralClient()
chroma_collection = chroma_client.get_or_create_collection("llamaindex_collection")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Load documents from a directory (replace 'data/' with your docs path)
docs = SimpleDirectoryReader("data/").load_data()

# Build the index on top of the Chroma-backed storage context
index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)

# Query the index through a query engine
query_engine = index.as_query_engine()
response = query_engine.query("What is the main topic of the documents?")
print("Response:", response)
output
Response: The main topic of the documents is ...

Common variations

  • Switch OpenAI models by configuring the global Settings object (ServiceContext is deprecated in LlamaIndex 0.10+).
  • Use async versions of LlamaIndex methods if your environment supports async.
  • Persist the Chroma collection to disk by creating it from a chromadb.PersistentClient instead of the in-memory client.
python
from llama_index.core import Settings, StorageContext, VectorStoreIndex
from llama_index.llms.openai import OpenAI

# Example: use GPT-4o for responses (Settings replaces the deprecated ServiceContext)
Settings.llm = OpenAI(model="gpt-4o")

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)

Troubleshooting

  • If you see ModuleNotFoundError for chromadb or llama_index.vector_stores.chroma, install the missing package with pip install chromadb llama-index-vector-stores-chroma.
  • If queries return empty or irrelevant results, verify documents are loaded correctly and indexed.
  • Check your OPENAI_API_KEY environment variable is set and valid.

Key Takeaways

  • Use ChromaVectorStore as the vector store backend in LlamaIndex for efficient similarity search.
  • Install chromadb and the llama-index-vector-stores-chroma integration, and set the OPENAI_API_KEY environment variable before running.
  • Configure Settings (the successor to ServiceContext) to switch OpenAI models or adjust defaults.
  • Load documents properly before indexing to ensure relevant query results.
  • Troubleshoot missing packages and environment variables to avoid runtime errors.
Verified 2026-04 · gpt-4o