How to use OpenAI embeddings in LlamaIndex
Quick answer
Use the OpenAIEmbeddings class from langchain_openai with your OpenAI API key, then pass it as the embed_model in LlamaIndex's ServiceContext to build a vector index. This enables semantic search and retrieval over your documents using OpenAI's embedding models.
Prerequisites
- Python 3.8+
- An OpenAI API key
- pip install llama-index langchain_openai "openai>=1.0"
Setup
Install the required packages and set your OpenAI API key as an environment variable.
- Install packages:
  pip install llama-index langchain_openai openai
- Set the environment variable in your shell:
  export OPENAI_API_KEY='your_api_key'   (Linux/macOS)
  setx OPENAI_API_KEY "your_api_key"     (Windows)
Step by step
This example shows how to create an embedding model using OpenAI, build a LlamaIndex vector index from documents, and query it.
import os
from llama_index import SimpleDirectoryReader, ServiceContext, VectorStoreIndex
from langchain_openai import OpenAIEmbeddings

# Initialize OpenAI embeddings with the API key from the environment
embeddings = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_API_KEY"])

# Create a ServiceContext that uses these embeddings
# (LlamaIndex wraps LangChain embedding objects automatically)
service_context = ServiceContext.from_defaults(embed_model=embeddings)

# Load documents from a directory (replace 'docs' with your folder)
docs = SimpleDirectoryReader('docs').load_data()

# Build the vector index from the documents using the embeddings
index = VectorStoreIndex.from_documents(docs, service_context=service_context)

# Query the index through a query engine
query_engine = index.as_query_engine()
response = query_engine.query("What is LlamaIndex?")
print(response.response)
Output
LlamaIndex is a library that helps you build vector indexes over your documents for semantic search and retrieval.
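Under the hood, the vector index embeds each document chunk and answers queries by ranking stored vectors against the query embedding. A minimal sketch of that ranking step, using toy vectors in place of real OpenAI embeddings:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" standing in for OpenAI vectors
doc_vectors = {
    "doc_about_llamaindex": [0.9, 0.1, 0.0],
    "doc_about_cooking":    [0.0, 0.2, 0.9],
}
query_vector = [0.8, 0.2, 0.1]

# Rank documents by similarity to the query, highest first
ranked = sorted(doc_vectors.items(),
                key=lambda kv: cosine_similarity(query_vector, kv[1]),
                reverse=True)
print(ranked[0][0])  # the most similar document
```

Real embedding models return vectors with hundreds or thousands of dimensions, but the ranking logic is the same.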
Common variations
- Use a different embedding model by passing another embed_model implementation to ServiceContext.
- Use GPTVectorStoreIndex instead of VectorStoreIndex on older LlamaIndex versions.
- For async usage, wrap calls in async functions and use async-compatible clients.
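The async variation can be sketched as below; fake_embed is a stand-in for a real async embedding call (LangChain embeddings expose aembed_query for this), so the example runs without an API key:

```python
import asyncio

async def fake_embed(text: str) -> list:
    # Stand-in for an async embedding call such as
    # OpenAIEmbeddings().aembed_query(text); returns a dummy vector
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [float(len(text)), 0.0, 0.0]

async def embed_all(texts: list) -> list:
    # Issue all embedding requests concurrently instead of one at a time
    return await asyncio.gather(*(fake_embed(t) for t in texts))

vectors = asyncio.run(embed_all(["hello", "LlamaIndex"]))
print(len(vectors))  # one vector per input text
```

Concurrent requests mainly help when embedding many documents, since each call otherwise waits on a network round trip.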
Troubleshooting
- If you get authentication errors, verify that your OPENAI_API_KEY environment variable is set correctly.
- If embedding requests fail, check your network connection and API usage limits.
- Ensure document paths are correct to avoid file-not-found errors.
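For the authentication issue above, a small self-check like this (a sketch, not part of any library) can confirm the key is visible to Python before any API call is made:

```python
import os

def check_api_key(env_var: str = "OPENAI_API_KEY") -> bool:
    # Returns True if the variable is set and non-empty,
    # otherwise prints a hint and returns False
    key = os.environ.get(env_var, "")
    if not key:
        print(f"{env_var} is not set; export it before running.")
        return False
    return True

check_api_key()
```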
Key takeaways
- Use OpenAIEmbeddings from langchain_openai to generate embeddings for LlamaIndex.
- Pass the embeddings as embed_model to ServiceContext to integrate with LlamaIndex vector indexes.
- Load your documents, build the index, then query it for semantic search.