How to build a document QA chatbot
Quick answer
Build a document QA chatbot by embedding your documents with OpenAI embeddings, storing them in a vector store like FAISS, and answering user questions with a chat model such as `gpt-4o` grounded in the retrieved document context. Use the OpenAI SDK for both embeddings and chat completions to implement retrieval-augmented generation (RAG).

Prerequisites

- Python 3.8+
- An OpenAI API key (free tier works)
- `pip install openai faiss-cpu numpy`
Setup
Install required packages and set your OpenAI API key as an environment variable.
- Install packages:

```shell
pip install openai faiss-cpu numpy
```

- Set your API key as an environment variable in your shell:

```shell
export OPENAI_API_KEY='your_api_key'
```

Step by step
This example loads documents, creates embeddings with text-embedding-3-small, indexes them with FAISS, and uses gpt-4o to answer questions by retrieving relevant document chunks.
```python
import os

import faiss
import numpy as np
from openai import OpenAI

# Initialize the OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample documents
documents = [
    "Python is a popular programming language.",
    "OpenAI provides powerful AI models.",
    "FAISS is a library for efficient similarity search.",
    "Embeddings convert text into vectors for search.",
]

# Create an embedding for each document
embeddings = []
for doc in documents:
    response = client.embeddings.create(model="text-embedding-3-small", input=doc)
    embeddings.append(response.data[0].embedding)

# Convert to a NumPy array of 32-bit floats (FAISS requires float32)
embedding_matrix = np.array(embeddings).astype("float32")

# Build a FAISS index over the embeddings (exact L2 search)
index = faiss.IndexFlatL2(embedding_matrix.shape[1])
index.add(embedding_matrix)

def query_qa_bot(question: str) -> str:
    # Embed the question with the same model used for the documents
    q_resp = client.embeddings.create(model="text-embedding-3-small", input=question)
    q_embedding = np.array(q_resp.data[0].embedding).astype("float32")

    # Search for the top 2 most relevant documents
    D, I = index.search(q_embedding.reshape(1, -1), k=2)

    # Assemble the retrieved documents into a context string
    context = "\n".join(documents[i] for i in I[0])

    # Prepare chat messages that include the retrieved context
    messages = [
        {
            "role": "system",
            "content": "You are a helpful assistant answering questions based on the provided documents.",
        },
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]

    # Generate the answer
    response = client.chat.completions.create(model="gpt-4o", messages=messages)
    return response.choices[0].message.content

# Example usage
if __name__ == "__main__":
    question = "What is FAISS used for?"
    answer = query_qa_bot(question)
    print("Q:", question)
    print("A:", answer)
```

Output

```text
Q: What is FAISS used for?
A: FAISS is a library for efficient similarity search, commonly used to find similar vectors such as embeddings in large datasets.
```
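The indexing loop above issues one embedding request per document. For larger corpora you can batch, since the embeddings endpoint also accepts a list of input strings and returns one embedding per input, in order. A minimal sketch (the `batched` and `embed_documents` helper names are my own):

```python
def batched(items, size):
    """Yield successive batches of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def embed_documents(client, documents, batch_size=100):
    """Embed documents with one API call per batch.

    Assumes an OpenAI v1 client; the embeddings endpoint accepts a
    list of strings and returns embeddings in input order.
    """
    embeddings = []
    for batch in batched(documents, batch_size):
        resp = client.embeddings.create(model="text-embedding-3-small", input=batch)
        embeddings.extend(item.embedding for item in resp.data)
    return embeddings
```

The resulting list can be converted to a float32 NumPy array and added to the FAISS index exactly as in the main example.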
Common variations
You can enhance your document QA chatbot by:
- Using async calls with `asyncio` and the asynchronous `AsyncOpenAI` client.
- Streaming chat responses for real-time user feedback.
- Switching to other models such as `claude-3-5-haiku-20241022` or `gemini-2.0-flash` for different capabilities.
- Using other vector stores such as Chroma or Weaviate for scalable document retrieval.
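Streaming, for instance, can be sketched as follows, assuming the v1 `openai` Python SDK (the `stream_answer` helper name is hypothetical). With `stream=True`, the API returns an iterator of chunks whose deltas carry incremental text:

```python
def stream_answer(client, messages):
    """Stream a chat completion, printing text as it arrives.

    Assumes an OpenAI v1 client; with stream=True the call returns
    an iterator of chunks rather than a single response object.
    """
    stream = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        stream=True,
    )
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks (e.g. role or stop markers) carry no text
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)
```

You could drop this in as a replacement for the final `chat.completions.create` call in `query_qa_bot` to show answers token by token.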
Troubleshooting
- If you get authentication errors, verify that `OPENAI_API_KEY` is set correctly in your environment.
- If embedding is slow, batch your inputs or use a smaller model like `text-embedding-3-small`.
- If answers are off-topic, make sure retrieval returns relevant documents and increase `k` in the FAISS search.
- For large documents, chunk them into smaller pieces before embedding.
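For the last point, a minimal fixed-size chunker with overlap might look like this (the `chunk_text` helper is illustrative; production pipelines often split on sentence or token boundaries instead):

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping fixed-size character chunks.

    Overlap preserves context across chunk boundaries so that a
    sentence cut in two still appears whole in one of the chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Each chunk is then embedded and indexed in place of the whole document, and retrieval returns chunks rather than full documents.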
Key Takeaways
- Use OpenAI embeddings and a vector store like FAISS to enable document retrieval for QA chatbots.
- Combine retrieved document context with chat models like `gpt-4o` for accurate, context-aware answers.
- Chunk large documents and tune retrieval parameters to improve relevance and response quality.