How to use retriever in LangChain
Quick answer
Use a retriever in LangChain by first creating embeddings with OpenAIEmbeddings, loading documents into a vector store such as FAISS, and then calling as_retriever() on the vector store to perform similarity search. This enables efficient retrieval of relevant documents for downstream tasks.
Prerequisites
- Python 3.8+
- OpenAI API key
- pip install langchain_openai langchain_community faiss-cpu
Setup
Install the required packages and set your OpenAI API key as an environment variable.
- Install LangChain OpenAI and community packages, plus FAISS for vector search:
pip install langchain_openai langchain_community faiss-cpu
Step by step
This example loads text documents, creates embeddings with OpenAIEmbeddings, indexes them in a FAISS vector store, and uses the retriever to find relevant documents for a query.
import os
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import TextLoader
# Set your OpenAI API key in environment variable before running
# export OPENAI_API_KEY='your_api_key'
# Load documents from local text files (two files here, so the
# retriever has more than one document to choose from)
docs = []
for path in ["example_docs/doc1.txt", "example_docs/doc2.txt"]:
    docs.extend(TextLoader(path).load())
# Initialize OpenAI embeddings
embeddings = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_API_KEY"])
# Create FAISS vector store from documents
vectorstore = FAISS.from_documents(docs, embeddings)
# Get retriever from vector store
retriever = vectorstore.as_retriever()
# Query the retriever
query = "What is LangChain?"
results = retriever.invoke(query)
# Print retrieved document contents
for i, doc in enumerate(results):
print(f"Document {i+1}: {doc.page_content}\n")
output
Document 1: LangChain is a framework for developing applications powered by language models.
Document 2: LangChain enables chaining of LLM calls with other components like retrievers and memory.
Common variations
You can customize retriever behavior by setting parameters like search_type (e.g., 'similarity', 'mmr') and search_kwargs (e.g., {"k": 5} for the number of results), or by using a different vector store such as Chroma. Async retrieval is also supported via the retriever's ainvoke method.
from langchain_community.vectorstores import FAISS
# Customize retriever to return top 3 documents
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})
results = retriever.invoke("Explain embeddings.")
for doc in results:
    print(doc.page_content)
output
Embeddings convert text into numerical vectors that capture semantic meaning. They enable similarity search in vector databases.
Troubleshooting
- If you get an error about missing FAISS, ensure you installed `faiss-cpu` or the appropriate FAISS package for your platform.
- If no documents are returned, verify that your documents loaded correctly and that embeddings were generated without errors.
- Check that your OpenAI API key is set in `os.environ["OPENAI_API_KEY"]` and has sufficient quota.
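The checks above can be automated with a small helper. This is a hypothetical `check_setup` function (not part of LangChain) that inspects the loaded documents and the API key before you build the vector store; `docs` is whatever your document loader returned:

```python
import os

def check_setup(docs, api_key=None):
    """Hypothetical helper: report common setup problems before querying.

    `docs` is the list returned by a LangChain document loader; any
    object with a `page_content` attribute works. If `api_key` is not
    given, the OPENAI_API_KEY environment variable is checked instead.
    """
    if api_key is None:
        api_key = os.environ.get("OPENAI_API_KEY")
    problems = []
    if not api_key:
        problems.append("OPENAI_API_KEY is not set")
    if not docs:
        problems.append("no documents were loaded")
    elif all(not d.page_content.strip() for d in docs):
        problems.append("documents loaded but all are empty")
    return problems
```

Call it right after `loader.load()` and fail fast if it returns any problems, rather than debugging an empty result set later.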
Key Takeaways
- Use `OpenAIEmbeddings` with a vector store like `FAISS` to enable document retrieval in LangChain.
- Call `as_retriever()` on your vector store to get a retriever object for similarity search.
- Customize retriever parameters like `search_type` and `search_kwargs` to control retrieval behavior.
- Always load documents properly and verify your OpenAI API key is set in environment variables.
- Troubleshoot FAISS installation and document loading if retrieval returns empty results or errors.