How to use Ollama embeddings in LlamaIndex
Quick answer
Use the OllamaEmbedding class from the llama_index.embeddings.ollama module to generate embeddings via the Ollama API, and pass it to LlamaIndex as the embedding model for vector indexing and retrieval. Initialize OllamaEmbedding with your Ollama model name and base URL, then create the index with this embedding model.
Prerequisites
- Python 3.8+
- pip install llama-index llama-index-embeddings-ollama
- Ollama running locally or accessible via API
- Basic familiarity with LlamaIndex
Setup
Install the required Python packages and ensure Ollama is running locally or accessible via API. Set up environment variables if needed.
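One way to wire up those environment variables is to read the Ollama base URL with a small helper; a minimal sketch, assuming a variable named OLLAMA_BASE_URL (a naming convention for this example, not something LlamaIndex reads automatically):

```python
import os

def ollama_base_url(default="http://localhost:11434"):
    """Return the Ollama endpoint, preferring the OLLAMA_BASE_URL env var.

    OLLAMA_BASE_URL is an assumed variable name for this example.
    """
    return os.environ.get("OLLAMA_BASE_URL", default)

print(ollama_base_url())  # falls back to http://localhost:11434 when unset
```

This keeps the same code usable locally and against a remote Ollama host without edits.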
pip install llama-index llama-index-embeddings-ollama
Step by step
This example shows how to create an Ollama embedding model instance, use it to embed your documents, and build a LlamaIndex vector index for retrieval.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding

# Initialize Ollama embeddings with your model
embed_model = OllamaEmbedding(model_name="llama2", base_url="http://localhost:11434")

# Load documents from a directory
documents = SimpleDirectoryReader("data").load_data()

# Create the vector index with Ollama embeddings
index = VectorStoreIndex.from_documents(documents, embed_model=embed_model)

# Query the index
query_engine = index.as_query_engine()
response = query_engine.query("What is LlamaIndex?")
print(response.response)
Output
LlamaIndex is a data framework that helps you build and query vector indices over your documents.
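Under the hood, the vector index ranks documents by the similarity between the query embedding and each document embedding, commonly cosine similarity. A minimal sketch of that ranking idea (toy 3-dimensional vectors stand in for real Ollama embeddings; this is illustrative, not LlamaIndex's internal code):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real Ollama embeddings
query_vec = [0.1, 0.9, 0.2]
doc_vecs = {
    "doc_a": [0.1, 0.8, 0.3],  # close to the query
    "doc_b": [0.9, 0.1, 0.0],  # far from the query
}

# Rank documents by similarity to the query, best first
ranked = sorted(doc_vecs, key=lambda d: cosine_similarity(query_vec, doc_vecs[d]), reverse=True)
print(ranked)  # ['doc_a', 'doc_b']
```

The retriever returns the top-ranked chunks, and the LLM synthesizes the final answer from them.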
Common variations
- Use different Ollama models by changing the model_name parameter.
- Configure the base_url parameter if Ollama is running on a remote host.
- Use the async embedding methods if your environment supports asyncio.
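When pointing at a remote endpoint, it can help to see what goes over the wire. The sketch below builds a request for Ollama's /api/embeddings route by hand; the route and its {"model", "prompt"} payload match Ollama's documented REST API, but verify against your Ollama version before relying on it:

```python
import json
import urllib.request

def build_embeddings_request(base_url, model, prompt):
    """Build a POST request for Ollama's /api/embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_embeddings_request("http://localhost:11434", "llama2", "What is LlamaIndex?")
print(req.full_url)  # http://localhost:11434/api/embeddings

# To actually send it (requires a running Ollama server):
# with urllib.request.urlopen(req) as resp:
#     embedding = json.loads(resp.read())["embedding"]
```

If this direct call fails, LlamaIndex's embedding calls will fail too, which makes it a quick connectivity check.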
Troubleshooting
- If you get connection errors, verify Ollama is running and the endpoint URL is correct.
- Ensure your documents are loaded correctly; check file paths.
- Check model compatibility and Ollama version.
Key Takeaways
- Use OllamaEmbedding from llama_index.embeddings.ollama to integrate Ollama embeddings with LlamaIndex.
- Initialize OllamaEmbedding with your Ollama model name and base URL before creating the vector index.
- Ensure Ollama is running and accessible to avoid connection errors during embedding generation.