How to use Vertex AI embeddings with LangChain
Quick answer
Use the `vertexai` SDK to generate embeddings with a Vertex AI embedding model, then wrap those embeddings in a small custom LangChain-compatible embeddings class. Initialize LangChain vectorstores like FAISS with these embeddings for semantic search. This enables seamless integration of Vertex AI embeddings in LangChain pipelines.

Prerequisites

- Python 3.8+
- Google Cloud project with Vertex AI enabled
- Google Cloud SDK installed and authenticated
- `pip install vertexai langchain langchain_community faiss-cpu`
Setup
Install the required Python packages and authenticate your Google Cloud environment to use Vertex AI.
- Install packages: `vertexai`, `langchain`, `langchain_community`, and `faiss-cpu` for vector search.
- Authenticate with Google Cloud using `gcloud auth application-default login` or by setting the `GOOGLE_APPLICATION_CREDENTIALS` environment variable.
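For example, either of the following prepares credentials before running the code (the key path and project ID are placeholders you must replace with your own):

```shell
# Option 1: interactive login with Application Default Credentials
gcloud auth application-default login

# Option 2: point at a service account key (path is a placeholder)
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"

# The example below reads the project ID from this variable
export GOOGLE_CLOUD_PROJECT="your-project-id"
```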
```shell
pip install vertexai langchain langchain_community faiss-cpu
```

Step by step
This example shows how to generate embeddings from Vertex AI and use them with LangChain's FAISS vectorstore for semantic search.
```python
import os

import vertexai
from vertexai.language_models import TextEmbeddingModel
from langchain_community.vectorstores import FAISS
from langchain.schema import Document

# Initialize the Vertex AI SDK
vertexai.init(project=os.environ['GOOGLE_CLOUD_PROJECT'], location='us-central1')

# Load the Vertex AI embedding model
embedding_model = TextEmbeddingModel.from_pretrained('textembedding-gecko@001')

# Wrapper class so Vertex AI embeddings satisfy LangChain's embeddings interface
class VertexAIEmbeddings:
    def embed_documents(self, texts):
        # Note: gecko models accept at most 5 texts per request;
        # batch your calls for larger corpora.
        embeddings = embedding_model.get_embeddings(texts)
        return [e.values for e in embeddings]

    def embed_query(self, text):
        embedding = embedding_model.get_embeddings([text])[0]
        return embedding.values

# Sample documents
texts = [
    "LangChain enables easy LLM integrations.",
    "Vertex AI provides powerful embedding models.",
    "FAISS is a fast vector search library.",
]

# Create LangChain documents
docs = [Document(page_content=t) for t in texts]

# Instantiate the embeddings wrapper
embeddings = VertexAIEmbeddings()

# Build a FAISS vectorstore from the documents
vectorstore = FAISS.from_documents(docs, embeddings)

# Query the vectorstore
query = "What library helps with vector search?"
query_embedding = embeddings.embed_query(query)

# Retrieve the top 2 most relevant documents
results = vectorstore.similarity_search_by_vector(query_embedding, k=2)
for i, doc in enumerate(results, 1):
    print(f"Result {i}: {doc.page_content}")
```

Output

```
Result 1: FAISS is a fast vector search library.
Result 2: LangChain enables easy LLM integrations.
```
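The wrapper above works because `FAISS.from_documents` only relies on the `embed_documents`/`embed_query` contract, not on any particular base class. A minimal sketch with a hypothetical `HashEmbeddings` toy class (deterministic vectors, no cloud calls) shows the same interface shape, which you can use to test a pipeline offline:

```python
import hashlib
import math

class HashEmbeddings:
    """Toy stand-in for the Vertex AI wrapper: same interface, no API calls."""

    def _embed(self, text):
        # Derive a deterministic 8-dim unit vector from the text's SHA-256 digest.
        digest = hashlib.sha256(text.encode()).digest()
        vec = [b / 255 for b in digest[:8]]
        norm = math.sqrt(sum(v * v for v in vec))
        return [v / norm for v in vec]

    def embed_documents(self, texts):
        return [self._embed(t) for t in texts]

    def embed_query(self, text):
        return self._embed(text)

emb = HashEmbeddings()
vectors = emb.embed_documents(["hello", "world"])
print(len(vectors), len(vectors[0]))          # 2 8
# The same text always maps to the same vector
print(emb.embed_query("hello") == vectors[0])  # True
```

Any class with these two methods can be dropped into `FAISS.from_documents` in place of the Vertex AI wrapper.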
Common variations
- Use async Vertex AI calls by adapting the blocking `vertexai` SDK methods with `asyncio`.
- Swap `FAISS` for other LangChain vectorstores like `Chroma` or `Weaviate`.
- Change the embedding model by passing a different Vertex AI model name to `TextEmbeddingModel.from_pretrained()`.
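For the async variation, one straightforward approach is to run the blocking SDK call in a worker thread via `asyncio.to_thread`. This is a sketch of the pattern only: `get_embeddings_sync` below is a stub standing in for `embedding_model.get_embeddings`, so the event-loop handoff can be shown without cloud credentials:

```python
import asyncio

def get_embeddings_sync(texts):
    # Stub for embedding_model.get_embeddings(texts); returns fake 1-dim vectors.
    return [[float(len(t))] for t in texts]

class AsyncEmbeddings:
    async def aembed_documents(self, texts):
        # Run the blocking call in a worker thread so the event loop stays free.
        return await asyncio.to_thread(get_embeddings_sync, texts)

    async def aembed_query(self, text):
        return (await self.aembed_documents([text]))[0]

async def main():
    emb = AsyncEmbeddings()
    vecs = await emb.aembed_documents(["a", "bb"])
    print(vecs)  # [[1.0], [2.0]]

asyncio.run(main())
```

In a real pipeline you would replace the stub with the actual Vertex AI call; `asyncio.to_thread` requires Python 3.9+, so on 3.8 use `loop.run_in_executor` instead.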
Troubleshooting
- If you get authentication errors, ensure your `GOOGLE_APPLICATION_CREDENTIALS` env var points to a valid service account JSON key with Vertex AI permissions.
- If embeddings are empty or errors occur, verify that the model `textembedding-gecko@001` is available in your region.
- For vectorstore errors, confirm `faiss-cpu` is installed and compatible with your Python version.
Key Takeaways
- Use the `vertexai` SDK to generate embeddings with Vertex AI models.
- Wrap Vertex AI embeddings in a LangChain-compatible class for vectorstores.
- Integrate with LangChain vectorstores like `FAISS` for semantic search.
- Authenticate properly with Google Cloud to avoid permission issues.
- You can switch embedding models or vectorstores easily within this pattern.