How to load a database with LlamaIndex

Quick answer

To load data with LlamaIndex, use SimpleDirectoryReader or another loader to ingest your documents, then build an index from them. The index acts as your database: you query it in natural language and an LLM answers from the retrieved content.

PREREQUISITES

Python 3.8+
OpenAI API key (free tier works)
pip install llama-index openai

Setup

Install the llama-index package and set your OpenAI API key as an environment variable.

Install with pip install llama-index openai
Set environment variable in your shell: export OPENAI_API_KEY='your_api_key'

bash

pip install llama-index openai
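Authentication failures are easier to diagnose when the script checks for the key before any index is built. The following is a minimal sketch using only the standard library; the helper name require_api_key is hypothetical and is not part of LlamaIndex or the OpenAI SDK.

```python
import os


def require_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    `require_api_key` is an illustrative helper, not a library function.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Export it in your shell before running the script."
        )
    return key
```

Calling require_api_key() at the top of your script stops early with a readable message when the variable is missing or empty.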

Step by step

This example loads text documents from a directory, builds a LlamaIndex index (your database), and queries it using OpenAI's GPT-4o model.

python

import os

# These import paths assume llama-index 0.10 or newer
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.openai import OpenAI

# Fail early with a clear message if the API key is missing
if "OPENAI_API_KEY" not in os.environ:
    raise RuntimeError("Set the OPENAI_API_KEY environment variable first")

# Use OpenAI GPT-4o to generate answers (the key is read from the environment)
Settings.llm = OpenAI(model="gpt-4o")

# Load documents from a directory
loader = SimpleDirectoryReader("data")  # 'data' folder with text files
documents = loader.load_data()

# Build the index (database) from documents
index = VectorStoreIndex.from_documents(documents)

# Query the index through a query engine
query_engine = index.as_query_engine()
query = "What is the main topic of the documents?"
response = query_engine.query(query)
print(response.response)

output

The main topic of the documents is ... (depends on your data)
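To check which document chunks an answer was drawn from, you can inspect the response object. This sketch assumes a recent llama-index release, where a query response carries a source_nodes list whose items expose node, score, and a file_name metadata entry set by SimpleDirectoryReader. The helper summarize_sources is hypothetical.

```python
from typing import Any, List


def summarize_sources(response: Any, preview_chars: int = 80) -> List[str]:
    """Build one line per retrieved chunk: file name, similarity score, text preview.

    Assumes `response.source_nodes` holds items with `.node` and `.score`,
    as query responses do in recent llama-index releases.
    """
    lines = []
    for item in getattr(response, "source_nodes", []):
        file_name = item.node.metadata.get("file_name", "unknown file")
        score = "n/a" if item.score is None else f"{item.score:.2f}"
        preview = item.node.get_content()[:preview_chars].replace("\n", " ")
        lines.append(f"{file_name} (score {score}): {preview}")
    return lines
```

Printing each line of summarize_sources(response) after a query shows which files contributed to the answer, which helps when the answer looks wrong or incomplete.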

Common variations

Use SummaryIndex (GPTListIndex in older releases) or TreeIndex (GPTTreeIndex in older releases) for different indexing strategies.
Load documents from PDFs or web pages using other loaders.
Use async calls, such as the query engine's aquery method, to run queries concurrently.
Switch to other LLMs by changing the LLM configuration.
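Two of the variations above can be sketched as follows, assuming llama-index 0.10 or newer: building a SummaryIndex from files filtered by extension, and sending several questions concurrently through a query engine's aquery method. The function names build_summary_index and ask_many are hypothetical, and the llama-index import is deferred so the module loads even where the package is absent.

```python
import asyncio
from typing import Any, List, Sequence


def build_summary_index(
    data_dir: str = "data", extensions: Sequence[str] = (".txt", ".md")
) -> Any:
    """Build a SummaryIndex from files in data_dir that match the given extensions."""
    # Imported here so the module can be loaded without llama-index installed
    from llama_index.core import SimpleDirectoryReader, SummaryIndex

    reader = SimpleDirectoryReader(
        input_dir=data_dir,
        required_exts=list(extensions),  # skip files with other suffixes
        recursive=True,                  # include subfolders
    )
    return SummaryIndex.from_documents(reader.load_data())


async def ask_many(query_engine: Any, questions: Sequence[str]) -> List[str]:
    """Send several questions concurrently via the engine's async aquery method."""
    responses = await asyncio.gather(*(query_engine.aquery(q) for q in questions))
    # Answers come back in the same order as the questions
    return [str(r) for r in responses]
```

A typical call would be asyncio.run(ask_many(index.as_query_engine(), ["Question one?", "Question two?"])). Concurrency shortens wall-clock time but does not reduce the number of API calls.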

Troubleshooting

If you get authentication errors, verify your OPENAI_API_KEY environment variable is set correctly.
If documents fail to load, check the directory path and file formats.
For slow queries, consider using a smaller model. For slow startup, persist the index to disk so it is not rebuilt on every run.
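The path check and the index caching suggested above can be combined in one routine. This is a sketch that assumes llama-index 0.10 or newer and its StorageContext and load_index_from_storage helpers. The function names, the storage folder name, and the default extension list are illustrative choices.

```python
from pathlib import Path
from typing import Any, List, Sequence


def find_input_files(
    data_dir: str, extensions: Sequence[str] = (".txt", ".md")
) -> List[Path]:
    """Return matching files under data_dir, or raise if the folder is missing."""
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Data directory not found: {root.resolve()}")
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() in wanted
    )


def has_saved_index(persist_dir: str) -> bool:
    """True if persist_dir exists and contains at least one entry."""
    root = Path(persist_dir)
    return root.is_dir() and any(root.iterdir())


def load_or_build_index(data_dir: str = "data", persist_dir: str = "storage") -> Any:
    """Reload a persisted index if one exists; otherwise build it and save it."""
    # Imported here so the pure-Python helpers above work without llama-index
    from llama_index.core import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    if has_saved_index(persist_dir):
        storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
        return load_index_from_storage(storage_context)

    files = find_input_files(data_dir)
    if not files:
        raise ValueError(f"No supported files found in {data_dir}")
    documents = SimpleDirectoryReader(input_files=[str(p) for p in files]).load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=persist_dir)  # cache for later runs
    return index
```

Delete the storage folder whenever the source documents change, since this sketch does not detect stale caches.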

Key Takeaways

Use LlamaIndex loaders to ingest documents into an index acting as your database.
Build the index with VectorStoreIndex (GPTVectorStoreIndex in older releases) or another index type depending on your use case.
Supply your OpenAI API key through an environment variable and configure the LLM that LlamaIndex uses to generate answers.
You can query the index with natural language to retrieve relevant information.
Troubleshoot by verifying API keys, data paths, and model configurations.

Verified 2026-04 · gpt-4o
