How to use embedding cache in LangChain
Quick answer
Use `OpenAIEmbeddings` with a persistent vector store like FAISS or Chroma in LangChain to cache embeddings locally. This avoids redundant API calls by storing and reusing embeddings for texts that have already been processed.

Prerequisites

- Python 3.8+
- OpenAI API key
- `pip install langchain_openai langchain_community faiss-cpu chromadb`
Setup
Install necessary packages and set your environment variable for the OpenAI API key.
- Run `pip install langchain_openai langchain_community faiss-cpu chromadb`
- Set `export OPENAI_API_KEY='your_api_key'` on macOS/Linux, or `setx OPENAI_API_KEY "your_api_key"` on Windows

Step by step
This example shows how to create an embedding cache using FAISS vector store with OpenAIEmbeddings. It stores embeddings locally to avoid repeated API calls for the same texts.
```python
import os

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

# Initialize the embeddings client
embeddings = OpenAIEmbeddings(api_key=os.environ["OPENAI_API_KEY"])

# Sample documents to embed
texts = ["Hello world", "LangChain embedding cache example", "Hello world"]

# Create or load the FAISS index
index_path = "faiss_index"
try:
    # Load an existing index from disk; recent langchain_community versions
    # require allow_dangerous_deserialization=True because the index is pickled
    vectorstore = FAISS.load_local(
        index_path, embeddings, allow_dangerous_deserialization=True
    )
    print("Loaded existing FAISS index from disk.")
except Exception:
    # Create and save a new index if none is found
    vectorstore = FAISS.from_texts(texts, embeddings)
    vectorstore.save_local(index_path)
    print("Created new FAISS index and saved to disk.")

# Query the vector store
query = "Hello"
results = vectorstore.similarity_search(query, k=2)
for i, doc in enumerate(results, 1):
    print(f"Result {i}: {doc.page_content}")
```

Output

```
Created new FAISS index and saved to disk.
Result 1: Hello world
Result 2: Hello world
```
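The caching idea behind this example can be sketched without touching the API at all. In the sketch below, `fake_embed` is a hypothetical stand-in for a paid embedding call, and a plain dict plays the role of the persisted vector store: texts that were already embedded are served from the cache instead of being re-embedded.

```python
import hashlib

# Hypothetical stand-in for a paid embedding API call
def fake_embed(text: str) -> list[float]:
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:4]]

cache: dict[str, list[float]] = {}  # the persisted store in the real example
api_calls = 0

def embed_with_cache(text: str) -> list[float]:
    global api_calls
    if text not in cache:  # only call the "API" on a cache miss
        api_calls += 1
        cache[text] = fake_embed(text)
    return cache[text]

texts = ["Hello world", "LangChain embedding cache example", "Hello world"]
vectors = [embed_with_cache(t) for t in texts]
print(api_calls)  # 2: the duplicate "Hello world" is served from the cache
```

The duplicate text in the list above triggers only two embedding calls, which is exactly the saving the FAISS index provides across program runs.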
Common variations
You can use other vector stores like Chroma for persistent caching with similar APIs. Async usage is possible but requires async-compatible vector stores. Different embedding models can be swapped by changing OpenAIEmbeddings parameters or using other providers.
```python
import os

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

embeddings = OpenAIEmbeddings(api_key=os.environ["OPENAI_API_KEY"])

# Use a Chroma vector store as the embedding cache
persist_directory = "./chroma_cache"
vectorstore = Chroma.from_texts(
    texts=["Hello world", "LangChain embedding cache example"],
    embedding=embeddings,
    persist_directory=persist_directory,
)
# Older Chroma versions need an explicit persist(); with chromadb >= 0.4
# data in persist_directory is saved automatically and this call can be dropped
vectorstore.persist()

query = "Hello"
results = vectorstore.similarity_search(query, k=2)
for i, doc in enumerate(results, 1):
    print(f"Result {i}: {doc.page_content}")
```

Output

```
Result 1: Hello world
Result 2: LangChain embedding cache example
```
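Both stores follow the same create-or-load pattern: check disk first, embed only when nothing is there, and save afterwards. A minimal sketch of that pattern in plain Python, using a JSON file as a stand-in for the index directory, looks like this:

```python
import json
import os

CACHE_PATH = "embedding_cache.json"  # stand-in for faiss_index / chroma_cache

def load_or_create_cache(path: str) -> dict:
    # Load an existing cache from disk, or start fresh if none exists
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {}

def save_cache(cache: dict, path: str) -> None:
    # Always persist after adding embeddings, or reuse is impossible
    with open(path, "w") as f:
        json.dump(cache, f)

cache = load_or_create_cache(CACHE_PATH)
cache["Hello world"] = [0.1, 0.2, 0.3]  # the vector would come from the API
save_cache(cache, CACHE_PATH)

reloaded = load_or_create_cache(CACHE_PATH)
print("Hello world" in reloaded)  # True: no re-embedding needed on the next run
```

`save_local` (FAISS) and `persist` (older Chroma) play the role of `save_cache` here; forgetting them is the most common reason a "cache" silently re-embeds everything.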
Troubleshooting
- If you see `FileNotFoundError` when loading the index, ensure the index directory exists or create a new index.
- If embeddings are not cached, verify that the vector store's `save_local` or `persist` method is called.
- Check that your `OPENAI_API_KEY` environment variable is set correctly to avoid authentication errors.
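For the last point, a small hypothetical helper can fail fast with a clear message instead of a confusing authentication error deep inside the first embedding call. It only checks that the variable is present, not that the key is valid:

```python
import os

def require_api_key(env=None):
    """Return OPENAI_API_KEY from the given mapping (default: os.environ)."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before creating OpenAIEmbeddings."
        )
    return key

# Demonstrated with a stand-in mapping so no real key is needed:
print(require_api_key({"OPENAI_API_KEY": "sk-example"}))  # prints sk-example
```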
Key Takeaways
- Use persistent vector stores like FAISS or Chroma to cache embeddings locally in LangChain.
- Caching embeddings reduces redundant API calls and speeds up similarity searches.
- Always save or persist your vector store after adding embeddings to enable reuse.
- Set your API key securely via environment variables to avoid authentication issues.
- You can switch embedding models or vector stores without changing caching logic.
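The last takeaway can be sketched as a small helper that depends only on the `from_texts` interface the vector stores share. `FakeStore` below is a hypothetical stand-in so the sketch runs without an API key; in real code you would pass `FAISS` or `Chroma` and a real embeddings object instead:

```python
def build_index(store_cls, texts, embeddings):
    # Any store class exposing from_texts(texts, embeddings) works here,
    # so swapping FAISS for Chroma leaves the caching logic untouched
    return store_cls.from_texts(texts, embeddings)

class FakeStore:
    # Hypothetical stand-in mimicking the from_texts classmethod
    def __init__(self, texts):
        self.texts = texts

    @classmethod
    def from_texts(cls, texts, embeddings):
        return cls(texts)

index = build_index(FakeStore, ["Hello world"], embeddings=None)
print(index.texts)  # ['Hello world']
```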