
How to load a database with LlamaIndex

Quick answer
Load your documents with SimpleDirectoryReader (or another LlamaIndex loader), then build an index from them. The index acts as your queryable database: LlamaIndex retrieves the relevant passages from it and passes them to an LLM to answer your questions.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install llama-index openai

Setup

Install the llama-index package and set your OpenAI API key as an environment variable.

  • Install with pip install llama-index openai
  • Set environment variable in your shell: export OPENAI_API_KEY='your_api_key'
bash
pip install llama-index openai
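
Before running anything that calls the API, it can help to confirm that the environment variable is actually visible to Python. The following is a minimal sketch; the helper name `require_api_key` is hypothetical and not part of LlamaIndex or the OpenAI SDK.

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in the environment variable `name`, or raise.

    Hypothetical helper for illustration; it only inspects the environment
    and never contacts the OpenAI API.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it in your shell before running the script."
        )
    return key
```

Calling `require_api_key()` at the top of a script turns a missing key into an immediate, readable error instead of an authentication failure later on.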

Step by step

This example loads text documents from a directory, builds a LlamaIndex index (your database), and queries it using OpenAI's GPT-4o model.

python
# Written against llama-index 0.10+, where the core classes live in llama_index.core
import os

from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.openai import OpenAI

# Fail early if the API key is missing; LlamaIndex reads it from the environment
if not os.environ.get("OPENAI_API_KEY"):
    raise RuntimeError("Set the OPENAI_API_KEY environment variable first")

# Load documents from a directory
loader = SimpleDirectoryReader("data")  # 'data' folder with text files
documents = loader.load_data()

# Use OpenAI's GPT-4o to generate answers
# (Settings replaces the older LLMPredictor / ServiceContext setup)
Settings.llm = OpenAI(model="gpt-4o")

# Build the index (database) from documents
# (VectorStoreIndex was called GPTVectorStoreIndex in older releases)
index = VectorStoreIndex.from_documents(documents)

# Query the index through a query engine
query = "What is the main topic of the documents?"
query_engine = index.as_query_engine()
response = query_engine.query(query)
print(response.response)
output
The main topic of the documents is ... (depends on your data)
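
Because the index serves as the database, it is often useful to see which stored passages an answer was drawn from. The sketch below assumes the response object exposes `source_nodes` (each with a `score` and a `node` carrying `metadata` and `get_content()`), as in current LlamaIndex releases, and that the loader recorded a `file_name` metadata entry. The helper name `show_sources` is hypothetical.

```python
def show_sources(response, max_chars: int = 80) -> list:
    """Summarize the retrieved passages behind a query response.

    Assumes `response.source_nodes` is a list of objects with `.score`
    and `.node` (hypothetical helper; adjust to your LlamaIndex version).
    """
    lines = []
    for hit in response.source_nodes:
        name = hit.node.metadata.get("file_name", "<unknown>")
        # The score may be missing for some index types
        score = "n/a" if hit.score is None else f"{hit.score:.3f}"
        snippet = hit.node.get_content()[:max_chars].replace("\n", " ")
        lines.append(f"{name} (score={score}): {snippet}")
    return lines
```

Printing `"\n".join(show_sources(response))` after a query gives a quick view of which files contributed to the answer.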

Common variations

  • Use SummaryIndex or TreeIndex (named GPTListIndex and GPTTreeIndex in older releases) for different indexing strategies.
  • Load documents from PDFs or web pages using other loaders.
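  • Use async calls for concurrency, for example the query engine's async query method.
  • Switch to other LLMs by changing the LLM that LlamaIndex is configured with (Settings.llm in current releases, LLMPredictor in older ones).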
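
The variations above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: it assumes llama-index 0.10+ (imports from `llama_index.core`), that the file reader bundled with `llama-index` can parse your PDFs, and that `gpt-4o-mini` is available to your account. The function names `load_pdfs`, `build_summary_index`, and `ask_many` are hypothetical. LlamaIndex is imported inside the functions so the module loads even where it is not installed.

```python
import asyncio


def load_pdfs(data_dir: str = "data"):
    """Load only PDF files from data_dir, including subfolders."""
    from llama_index.core import SimpleDirectoryReader

    reader = SimpleDirectoryReader(
        input_dir=data_dir, required_exts=[".pdf"], recursive=True
    )
    return reader.load_data()


def build_summary_index(documents, model: str = "gpt-4o-mini"):
    """Build a SummaryIndex and switch the LLM used to answer queries."""
    from llama_index.core import Settings, SummaryIndex
    from llama_index.llms.openai import OpenAI

    Settings.llm = OpenAI(model=model)  # global default for later queries
    return SummaryIndex.from_documents(documents)


async def ask_many(index, questions):
    """Run several queries concurrently via the query engine's async API."""
    engine = index.as_query_engine()
    answers = await asyncio.gather(*(engine.aquery(q) for q in questions))
    return [str(answer) for answer in answers]
```

A typical call would be `asyncio.run(ask_many(index, ["First question?", "Second question?"]))`, which returns the answers in the same order as the questions.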

Troubleshooting

  • If you get authentication errors, verify your OPENAI_API_KEY environment variable is set correctly.
  • If documents fail to load, check the directory path and file formats.
  • For slow queries, consider a smaller model. To avoid re-embedding your documents on every run, persist the index to disk and reload it.
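
Two of these checks lend themselves to a short sketch: listing the files a loader would find, and reusing a saved index instead of rebuilding it. The code assumes llama-index 0.10+ for the persistence calls; the helper names, the `storage` directory, and the extension list are hypothetical choices for illustration.

```python
from pathlib import Path


def list_loadable_files(data_dir: str = "data", exts=(".txt", ".md", ".pdf")):
    """Return the files under data_dir whose extension is in exts.

    Raises FileNotFoundError if the directory does not exist, which is
    the most common reason documents fail to load.
    """
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Data directory not found: {root.resolve()}")
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() in exts
    )


def load_or_build_index(data_dir: str = "data", persist_dir: str = "storage"):
    """Reload a persisted index if one exists, otherwise build and save it."""
    # Imported lazily so the file check above works without LlamaIndex
    from llama_index.core import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    if Path(persist_dir).is_dir():
        storage = StorageContext.from_defaults(persist_dir=persist_dir)
        return load_index_from_storage(storage)

    documents = SimpleDirectoryReader(data_dir).load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=persist_dir)
    return index
```

Running `list_loadable_files("data")` first shows whether the directory exists and contains files of the expected types. If the source documents change, delete the `storage` directory so the index is rebuilt.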

Key Takeaways

  • Use LlamaIndex loaders to ingest documents into an index acting as your database.
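  • Build the index with VectorStoreIndex (GPTVectorStoreIndex in older releases) or another index type, depending on your use case.
  • Supply your OpenAI API key through the OPENAI_API_KEY environment variable and choose the model through LlamaIndex's LLM configuration (Settings.llm in current releases, LLMPredictor in older ones).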
  • You can query the index with natural language to retrieve relevant information.
  • Troubleshoot by verifying API keys, data paths, and model configurations.
Verified 2026-04 · gpt-4o