How to use embedding cache in LangChain
Quick answer
Use `OpenAIEmbeddings` with a persistent vector store like FAISS or Chroma in LangChain to cache embeddings locally. This avoids redundant API calls by storing and reusing embeddings for texts that have already been processed.

Prerequisites

- Python 3.8+
- OpenAI API key
- `pip install langchain_openai langchain_community faiss-cpu chromadb`
Setup
Install necessary packages and set your environment variable for the OpenAI API key.
- Run `pip install langchain_openai langchain_community faiss-cpu chromadb`
- Set `export OPENAI_API_KEY='your_api_key'` on macOS/Linux, or `setx OPENAI_API_KEY "your_api_key"` on Windows

Step by step
This example shows how to create an embedding cache using FAISS vector store with OpenAIEmbeddings. It stores embeddings locally to avoid repeated API calls for the same texts.
```python
import os

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

# Initialize the embeddings client
embeddings = OpenAIEmbeddings(api_key=os.environ["OPENAI_API_KEY"])

# Sample documents to embed
texts = ["Hello world", "LangChain embedding cache example", "Hello world"]

# Create or load the FAISS index
index_path = "faiss_index"
try:
    # Load an existing index from disk; recent langchain_community versions
    # require allow_dangerous_deserialization=True because the index is pickled
    vectorstore = FAISS.load_local(
        index_path, embeddings, allow_dangerous_deserialization=True
    )
    print("Loaded existing FAISS index from disk.")
except Exception:
    # Create and save a new index if none is found
    vectorstore = FAISS.from_texts(texts, embeddings)
    vectorstore.save_local(index_path)
    print("Created new FAISS index and saved to disk.")

# Query the vector store
query = "Hello"
results = vectorstore.similarity_search(query, k=2)
for i, doc in enumerate(results, 1):
    print(f"Result {i}: {doc.page_content}")
```

Output

```
Created new FAISS index and saved to disk.
Result 1: Hello world
Result 2: Hello world
```
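The caching idea behind this example can be sketched without touching the API at all. In the sketch below, `fake_embed` is a hypothetical stand-in for a paid embedding call, and a plain dict plays the role of the persisted vector store: texts that were already embedded are served from the cache instead of being re-embedded.

```python
import hashlib

# Hypothetical stand-in for a paid embedding API call
def fake_embed(text: str) -> list[float]:
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:4]]

cache: dict[str, list[float]] = {}  # the persisted store in the real example
api_calls = 0

def embed_with_cache(text: str) -> list[float]:
    global api_calls
    if text not in cache:  # only call the "API" on a cache miss
        api_calls += 1
        cache[text] = fake_embed(text)
    return cache[text]

texts = ["Hello world", "LangChain embedding cache example", "Hello world"]
vectors = [embed_with_cache(t) for t in texts]
print(api_calls)  # 2: the duplicate "Hello world" is served from the cache
```

The duplicate text in the list above triggers only two embedding calls, which is exactly the saving the FAISS index provides across program runs.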
Common variations
You can use other vector stores like Chroma for persistent caching with similar APIs. Async usage is possible but requires async-compatible vector stores. Different embedding models can be swapped by changing OpenAIEmbeddings parameters or using other providers.
```python
import os

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

embeddings = OpenAIEmbeddings(api_key=os.environ["OPENAI_API_KEY"])

# Use a Chroma vector store as the embedding cache
persist_directory = "./chroma_cache"
vectorstore = Chroma.from_texts(
    texts=["Hello world", "LangChain embedding cache example"],
    embedding=embeddings,
    persist_directory=persist_directory,
)
# Older Chroma versions need an explicit persist(); with chromadb >= 0.4
# data in persist_directory is saved automatically and this call can be dropped
vectorstore.persist()

query = "Hello"
results = vectorstore.similarity_search(query, k=2)
for i, doc in enumerate(results, 1):
    print(f"Result {i}: {doc.page_content}")
```

Output

```
Result 1: Hello world
Result 2: LangChain embedding cache example
```
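Both stores follow the same create-or-load pattern: check disk first, embed only when nothing is there, and save afterwards. A minimal sketch of that pattern in plain Python, using a JSON file as a stand-in for the index directory, looks like this:

```python
import json
import os

CACHE_PATH = "embedding_cache.json"  # stand-in for faiss_index / chroma_cache

def load_or_create_cache(path: str) -> dict:
    # Load an existing cache from disk, or start fresh if none exists
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {}

def save_cache(cache: dict, path: str) -> None:
    # Always persist after adding embeddings, or reuse is impossible
    with open(path, "w") as f:
        json.dump(cache, f)

cache = load_or_create_cache(CACHE_PATH)
cache["Hello world"] = [0.1, 0.2, 0.3]  # the vector would come from the API
save_cache(cache, CACHE_PATH)

reloaded = load_or_create_cache(CACHE_PATH)
print("Hello world" in reloaded)  # True: no re-embedding needed on the next run
```

`save_local` (FAISS) and `persist` (older Chroma) play the role of `save_cache` here; forgetting them is the most common reason a "cache" silently re-embeds everything.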
Troubleshooting
- If you see `FileNotFoundError` when loading the index, ensure the index directory exists or create a new index.
- If embeddings are not cached, verify that the vector store's `save_local` or `persist` method is called.
- Check that your `OPENAI_API_KEY` environment variable is set correctly to avoid authentication errors.
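For the last point, a small hypothetical helper can fail fast with a clear message instead of a confusing authentication error deep inside the first embedding call. It only checks that the variable is present, not that the key is valid:

```python
import os

def require_api_key(env=None):
    """Return OPENAI_API_KEY from the given mapping (default: os.environ)."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before creating OpenAIEmbeddings."
        )
    return key

# Demonstrated with a stand-in mapping so no real key is needed:
print(require_api_key({"OPENAI_API_KEY": "sk-example"}))  # prints sk-example
```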
Key Takeaways
- Use persistent vector stores like FAISS or Chroma to cache embeddings locally in LangChain.
- Caching embeddings reduces redundant API calls and speeds up similarity searches.
- Always save or persist your vector store after adding embeddings to enable reuse.
- Set your API key securely via environment variables to avoid authentication issues.
- You can switch embedding models or vector stores without changing caching logic.
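The last takeaway can be sketched as a small helper that depends only on the `from_texts` interface the vector stores share. `FakeStore` below is a hypothetical stand-in so the sketch runs without an API key; in real code you would pass `FAISS` or `Chroma` and a real embeddings object instead:

```python
def build_index(store_cls, texts, embeddings):
    # Any store class exposing from_texts(texts, embeddings) works here,
    # so swapping FAISS for Chroma leaves the caching logic untouched
    return store_cls.from_texts(texts, embeddings)

class FakeStore:
    # Hypothetical stand-in mimicking the from_texts classmethod
    def __init__(self, texts):
        self.texts = texts

    @classmethod
    def from_texts(cls, texts, embeddings):
        return cls(texts)

index = build_index(FakeStore, ["Hello world"], embeddings=None)
print(index.texts)  # ['Hello world']
```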