How to build a question answering system over documents
Quick answer
Build a question answering system over documents by first converting documents into vector embeddings using models like
OpenAIEmbeddings, then storing them in a vector database such as FAISS. Query the system by embedding the question, retrieving relevant document chunks, and using an LLM like gpt-4o to generate answers based on the retrieved context.

Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install openai langchain langchain_community langchain-openai faiss-cpu
Setup
Install required Python packages and set your OpenAI API key as an environment variable.
pip install openai langchain langchain_community langchain-openai faiss-cpu

Step by step
This example loads text documents, creates embeddings, stores them in a FAISS vector store, and queries with an LLM to answer questions based on document content.
import os
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import TextLoader
from openai import OpenAI
# Load documents
loader = TextLoader("./docs/sample.txt")
docs = loader.load()
# Create embeddings
embeddings = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_API_KEY"])
# Build FAISS vector store
vectorstore = FAISS.from_documents(docs, embeddings)
# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Query function
def answer_question(question: str) -> str:
    # Embed the question and retrieve the most relevant document chunks
    relevant_docs = vectorstore.similarity_search(question, k=3)
    context = "\n\n".join(doc.page_content for doc in relevant_docs)

    # Prepare the prompt
    prompt_template = """You are a helpful assistant. Use the following context to answer the question.

Context:
{context}

Question: {question}
Answer:"""
    prompt = prompt_template.format(context=context, question=question)

    # Call the LLM
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
# Example usage
question = "What is the main topic of the documents?"
answer = answer_question(question)
print("Q:", question)
print("A:", answer)

Output

Q: What is the main topic of the documents?
A: [LLM-generated answer based on document content]
Common variations
- Use claude-3-5-sonnet-20241022 from Anthropic instead of OpenAI; relative strengths in coding and reasoning vary by task and model version.
- Implement async calls for higher throughput in production.
- Use chunking and overlap strategies to improve retrieval quality on large documents.
- Swap FAISS for other vector stores such as Chroma or Weaviate, depending on scale and feature needs.
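The chunking-and-overlap idea can be sketched without any framework. The `chunk_text` helper below is hypothetical, written only to show how overlap keeps boundary-spanning sentences intact; LangChain's text splitters provide a production version of the same idea with sentence-aware splitting.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap, so content
    spanning a chunk boundary still appears whole in at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("x" * 500, chunk_size=200, overlap=50)
print(len(chunks))      # → 4
print(len(chunks[0]))   # → 200
```

Each chunk would then be embedded and stored individually, so retrieval can return just the passages relevant to a question instead of whole documents.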
Troubleshooting
- If retrieval returns irrelevant results, increase k in similarity_search or improve document chunking.
- If API calls fail, verify that your OPENAI_API_KEY environment variable is set correctly.
- For slow responses, consider caching embeddings or using a smaller model like gpt-4o-mini.
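Embedding caching can be as simple as keying results by a hash of the text. This sketch uses a stand-in `embed` function in place of the real OpenAI call, since the cache logic, not the embedding itself, is the point here.

```python
import hashlib

_cache: dict[str, list[float]] = {}

def embed(text: str) -> list[float]:
    # Stand-in for a real embeddings API call; assumed for illustration.
    return [float(len(text))]

def embed_cached(text: str) -> list[float]:
    """Return a cached embedding if this exact text was embedded before,
    otherwise compute it once and store it."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = embed(text)
    return _cache[key]

embed_cached("hello")   # computes and stores the embedding
embed_cached("hello")   # served from the cache, no second API call
```

For the FAISS index itself, persisting it to disk between runs avoids re-embedding unchanged documents on every startup.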
Key Takeaways
- Convert documents into vector embeddings to enable semantic search.
- Use a vector database like FAISS to efficiently retrieve relevant document chunks.
- Feed retrieved context plus the question to an LLM for accurate answers.
- Experiment with different models and vector stores to optimize performance.
- Proper chunking and overlap improve retrieval relevance and answer quality.
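Under the hood, the semantic search in these takeaways is just ranking chunk vectors by similarity to the question vector. A tiny cosine-similarity sketch with made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions) shows the idea:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 for identical directions, 0.0 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

question_vec = [1.0, 0.0, 1.0]
chunk_vecs = {
    "chunk about topic A": [0.9, 0.1, 0.8],  # similar direction -> high score
    "chunk about topic B": [0.0, 1.0, 0.0],  # orthogonal -> score 0
}
ranked = sorted(chunk_vecs, key=lambda c: cosine(question_vec, chunk_vecs[c]), reverse=True)
print(ranked[0])  # → chunk about topic A
```

FAISS performs this kind of ranking at scale with approximate nearest-neighbor indexes, which is why it stays fast even over millions of chunks.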