How to create a vector store in LangChain
Quick answer
Use OpenAIEmbeddings to generate vector embeddings and FAISS from langchain_community.vectorstores to create a vector store. Load your documents, embed them, and then build the vector store for semantic search.

Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install langchain_openai langchain_community faiss-cpu
Setup
Install the required packages and set your OpenAI API key in the environment variables.
- Install packages: pip install langchain_openai langchain_community faiss-cpu
- Set environment variable: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows)

Step by step
This example loads text documents, creates OpenAI embeddings, and builds a FAISS vector store for semantic search.
import os
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import TextLoader
# Load documents from local text files
loader = TextLoader("example.txt")
docs = loader.load()
# Initialize OpenAI embeddings
embeddings = OpenAIEmbeddings(api_key=os.environ["OPENAI_API_KEY"])
# Create FAISS vector store from documents
vector_store = FAISS.from_documents(docs, embeddings)
# Save the vector store locally
vector_store.save_local("faiss_index")
# Example query: search top 3 relevant docs
query = "What is LangChain?"
results = vector_store.similarity_search(query, k=3)
for i, doc in enumerate(results, 1):
    print(f"Result {i}: {doc.page_content}")

Output
Result 1: LangChain is a framework for building applications with LLMs.
Result 2: LangChain supports vector stores for semantic search.
Result 3: You can create vector stores using FAISS and OpenAI embeddings.
Common variations
- Use other vector stores like Chroma or Weaviate instead of FAISS.
- Use different embedding models by swapping OpenAIEmbeddings with other providers.
- Load documents from PDFs or other formats using PyPDFLoader or custom loaders.
- Use async methods if your environment supports it.
Troubleshooting
- If you get ModuleNotFoundError, ensure faiss-cpu is installed correctly.
- If embeddings fail, verify your OPENAI_API_KEY is set and valid.
- For large document sets, consider batching embeddings to avoid rate limits.
- If similarity search returns empty, check document loading and embedding steps.
Key Takeaways
- Use OpenAIEmbeddings with FAISS to create efficient vector stores in LangChain.
- Load documents with LangChain loaders like TextLoader or PyPDFLoader before embedding.
- Save and reuse vector stores locally to optimize performance and avoid recomputing embeddings.
- Swap vector stores or embedding models easily to fit your use case.
- Ensure environment variables and dependencies are correctly set to avoid common errors.