How to use retriever in LangChain
Quick answer
Use a retriever in LangChain by first creating embeddings with OpenAIEmbeddings, loading documents into a vector store such as FAISS, and then calling as_retriever() on the vector store to perform similarity search. This enables efficient retrieval of relevant documents for downstream tasks.
Prerequisites
- Python 3.8+
- OpenAI API key
- pip install langchain_openai langchain_community faiss-cpu
Setup
Install the required packages and set your OpenAI API key as an environment variable.
- Install LangChain OpenAI and community packages, plus FAISS for vector search:
pip install langchain_openai langchain_community faiss-cpu
Step by step
This example loads text documents, creates embeddings with OpenAIEmbeddings, indexes them in a FAISS vector store, and uses the retriever to find relevant documents for a query.
import os
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import TextLoader
# Set your OpenAI API key in environment variable before running
# export OPENAI_API_KEY='your_api_key'
# Load documents from local text files (two files here, so the
# retriever has more than one document to choose from)
docs = []
for path in ["example_docs/doc1.txt", "example_docs/doc2.txt"]:
    docs.extend(TextLoader(path).load())
# Initialize OpenAI embeddings
embeddings = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_API_KEY"])
# Create FAISS vector store from documents
vectorstore = FAISS.from_documents(docs, embeddings)
# Get retriever from vector store
retriever = vectorstore.as_retriever()
# Query the retriever
query = "What is LangChain?"
results = retriever.invoke(query)
# Print retrieved document contents
for i, doc in enumerate(results):
print(f"Document {i+1}: {doc.page_content}\n")
output
Document 1: LangChain is a framework for developing applications powered by language models.
Document 2: LangChain enables chaining of LLM calls with other components like retrievers and memory.
Common variations
You can customize retriever behavior by setting parameters like search_type (e.g., 'similarity', 'mmr') and search_kwargs (e.g., {"k": 5} for the number of results), or by using a different vector store such as Chroma. Async retrieval is also supported via the retriever's ainvoke method.
from langchain_community.vectorstores import FAISS
# Customize retriever to return top 3 documents
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})
results = retriever.invoke("Explain embeddings.")
for doc in results:
    print(doc.page_content)
output
Embeddings convert text into numerical vectors that capture semantic meaning. They enable similarity search in vector databases.
Troubleshooting
- If you get an error about missing FAISS, ensure you installed `faiss-cpu` or the appropriate FAISS package for your platform.
- If no documents are returned, verify that your documents loaded correctly and that embeddings were generated without errors.
- Check that your OpenAI API key is set in `os.environ["OPENAI_API_KEY"]` and has sufficient quota.
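The checks above can be automated with a small helper. This is a hypothetical `check_setup` function (not part of LangChain) that inspects the loaded documents and the API key before you build the vector store; `docs` is whatever your document loader returned:

```python
import os

def check_setup(docs, api_key=None):
    """Hypothetical helper: report common setup problems before querying.

    `docs` is the list returned by a LangChain document loader; any
    object with a `page_content` attribute works. If `api_key` is not
    given, the OPENAI_API_KEY environment variable is checked instead.
    """
    if api_key is None:
        api_key = os.environ.get("OPENAI_API_KEY")
    problems = []
    if not api_key:
        problems.append("OPENAI_API_KEY is not set")
    if not docs:
        problems.append("no documents were loaded")
    elif all(not d.page_content.strip() for d in docs):
        problems.append("documents loaded but all are empty")
    return problems
```

Call it right after `loader.load()` and fail fast if it returns any problems, rather than debugging an empty result set later.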
Key Takeaways
- Use `OpenAIEmbeddings` with a vector store like `FAISS` to enable document retrieval in LangChain.
- Call `as_retriever()` on your vector store to get a retriever object for similarity search.
- Customize retriever parameters like `search_type` and `search_kwargs` to control retrieval behavior.
- Always load documents properly and verify your OpenAI API key is set in environment variables.
- Troubleshoot FAISS installation and document loading if retrieval returns empty results or errors.