How to create a VectorStoreIndex in LlamaIndex
Quick answer
Use VectorStoreIndex from llama_index by first loading your documents, then configuring an embedding model, and finally initializing the index with those documents. This enables efficient semantic search over your data using vector similarity.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install llama-index openai
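The phrase "vector similarity" refers to comparing embedding vectors numerically, most commonly by cosine similarity. A minimal sketch in plain Python (no LlamaIndex required; the toy vectors here are made up for illustration, real OpenAI embeddings have ~1536 dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings"
query_vec = [0.9, 0.1, 0.0]
doc_a = [0.8, 0.2, 0.1]   # similar direction -> high score
doc_b = [0.0, 0.1, 0.9]   # different direction -> low score

print(cosine_similarity(query_vec, doc_a))  # close to 1.0
print(cosine_similarity(query_vec, doc_b))  # close to 0.0
```

The index stores one such vector per document chunk and ranks chunks by this score at query time.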
Setup
Install the llama-index package and set your OpenAI API key as an environment variable to enable embedding generation.
pip install llama-index openai
Step by step
This example loads simple text documents, creates a VectorStoreIndex using OpenAI embeddings, and queries the index.
import os
from llama_index import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
from llama_index.embeddings.openai import OpenAIEmbedding
# Set your OpenAI API key before running, e.g. in your shell:
#   export OPENAI_API_KEY="sk-..."
# Load documents from a directory (replace 'data' with your folder)
docs = SimpleDirectoryReader('data').load_data()
# Initialize embedding model
embedding_model = OpenAIEmbedding(api_key=os.environ["OPENAI_API_KEY"])
# Create service context with embedding model
service_context = ServiceContext.from_defaults(embed_model=embedding_model)
# Create VectorStoreIndex with documents and service context
index = VectorStoreIndex.from_documents(docs, service_context=service_context)
# Query the index
query = "What is the main topic of the documents?"
query_engine = index.as_query_engine()
response = query_engine.query(query)
print(response.response)
Output
The main topic of the documents is ... (depends on your data)
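Conceptually, the query step embeds the question, scores every stored document vector against it, and hands the best matches to the LLM. A toy top-k retrieval loop in plain Python (hypothetical embeddings and file names; not the actual LlamaIndex internals):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def top_k(query_vec, doc_vecs, k=2):
    """Return the k document ids most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Made-up embeddings standing in for real ones
doc_vecs = {
    "intro.txt":   [0.9, 0.1, 0.0],
    "pricing.txt": [0.1, 0.9, 0.1],
    "faq.txt":     [0.8, 0.2, 0.2],
}
print(top_k([1.0, 0.0, 0.0], doc_vecs))  # the docs whose vectors point the same way
```

LlamaIndex performs this ranking for you and then synthesizes an answer from the retrieved chunks.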
Common variations
- Use different embedding models by swapping OpenAIEmbedding for another supported embedding class.
- Create the index asynchronously using async methods if your environment supports it.
- Integrate with other vector stores such as FAISS or Pinecone by customizing the vector store backend.
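The "swap the embedding model" variation works because the index only needs something that maps text to a vector. A schematic of that pluggability in plain Python (the class names here are hypothetical illustrations, not LlamaIndex classes):

```python
from typing import Dict, List, Protocol

class Embedder(Protocol):
    def embed(self, text: str) -> List[float]: ...

class FakeLengthEmbedder:
    """Stand-in model: 'embeds' text via a few cheap surface features."""
    def embed(self, text: str) -> List[float]:
        return [float(len(text)), float(text.count(" ")), float(text.count("e"))]

def build_index(docs: List[str], embedder: Embedder) -> Dict[str, List[float]]:
    # Any object with an .embed() method can be dropped in,
    # just as OpenAIEmbedding can be replaced in LlamaIndex
    return {doc: embedder.embed(doc) for doc in docs}

index = build_index(["hello world", "vector stores"], FakeLengthEmbedder())
```

Swapping models changes only the object you pass in; the indexing and query flow stays the same.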
Troubleshooting
- If you get authentication errors, verify that your OPENAI_API_KEY environment variable is set correctly.
- If documents fail to load, check the path and the file formats supported by SimpleDirectoryReader.
- For slow queries, consider reducing document size or using a smaller embedding model.
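For the authentication point, a fail-fast check you can run before building the index (the helper name is mine, not a LlamaIndex function):

```python
import os

def check_openai_key():
    """Raise a clear error early if the API key is missing or empty."""
    key = os.environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Export it first, e.g. "
            'export OPENAI_API_KEY="sk-..."'
        )
    return key
```

Calling this at the top of your script turns a confusing mid-run authentication failure into an immediate, readable error.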
Key Takeaways
- Use VectorStoreIndex.from_documents with a ServiceContext carrying your embedding model to create the index.
- Load documents with SimpleDirectoryReader or other loaders compatible with LlamaIndex.
- Set your OpenAI API key in os.environ["OPENAI_API_KEY"] before running the code.
- You can customize embeddings and vector stores for different use cases and performance needs.