How to build a document QA chatbot
Quick answer
Build a document QA chatbot by embedding your documents with OpenAI embeddings, storing them in a vector store like FAISS, and answering user questions with a chat model such as `gpt-4o` grounded in the retrieved document context. Use the OpenAI SDK for both embeddings and chat completions to implement retrieval-augmented generation (RAG).

Prerequisites

- Python 3.8+
- An OpenAI API key (free tier works)
- `pip install openai faiss-cpu numpy`
Setup
Install required packages and set your OpenAI API key as an environment variable.
- Install packages:

```shell
pip install openai faiss-cpu numpy
```

- Set your API key as an environment variable in your shell:

```shell
export OPENAI_API_KEY='your_api_key'
```

Step by step
This example loads documents, creates embeddings with text-embedding-3-small, indexes them with FAISS, and uses gpt-4o to answer questions by retrieving relevant document chunks.
```python
import os

import faiss
import numpy as np
from openai import OpenAI

# Initialize the OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample documents
documents = [
    "Python is a popular programming language.",
    "OpenAI provides powerful AI models.",
    "FAISS is a library for efficient similarity search.",
    "Embeddings convert text into vectors for search.",
]

# Create an embedding for each document
embeddings = []
for doc in documents:
    response = client.embeddings.create(model="text-embedding-3-small", input=doc)
    embeddings.append(response.data[0].embedding)

# Convert to a NumPy array of 32-bit floats (FAISS requires float32)
embedding_matrix = np.array(embeddings).astype("float32")

# Build a FAISS index over the embeddings (exact L2 search)
index = faiss.IndexFlatL2(embedding_matrix.shape[1])
index.add(embedding_matrix)

def query_qa_bot(question: str) -> str:
    # Embed the question with the same model used for the documents
    q_resp = client.embeddings.create(model="text-embedding-3-small", input=question)
    q_embedding = np.array(q_resp.data[0].embedding).astype("float32")

    # Search for the top 2 most relevant documents
    D, I = index.search(q_embedding.reshape(1, -1), k=2)

    # Assemble the retrieved documents into a context string
    context = "\n".join(documents[i] for i in I[0])

    # Prepare chat messages that include the retrieved context
    messages = [
        {
            "role": "system",
            "content": "You are a helpful assistant answering questions based on the provided documents.",
        },
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]

    # Generate the answer
    response = client.chat.completions.create(model="gpt-4o", messages=messages)
    return response.choices[0].message.content

# Example usage
if __name__ == "__main__":
    question = "What is FAISS used for?"
    answer = query_qa_bot(question)
    print("Q:", question)
    print("A:", answer)
```

Output

```text
Q: What is FAISS used for?
A: FAISS is a library for efficient similarity search, commonly used to find similar vectors such as embeddings in large datasets.
```
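The indexing loop above issues one embedding request per document. For larger corpora you can batch, since the embeddings endpoint also accepts a list of input strings and returns one embedding per input, in order. A minimal sketch (the `batched` and `embed_documents` helper names are my own):

```python
def batched(items, size):
    """Yield successive batches of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def embed_documents(client, documents, batch_size=100):
    """Embed documents with one API call per batch.

    Assumes an OpenAI v1 client; the embeddings endpoint accepts a
    list of strings and returns embeddings in input order.
    """
    embeddings = []
    for batch in batched(documents, batch_size):
        resp = client.embeddings.create(model="text-embedding-3-small", input=batch)
        embeddings.extend(item.embedding for item in resp.data)
    return embeddings
```

The resulting list can be converted to a float32 NumPy array and added to the FAISS index exactly as in the main example.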
Common variations
You can enhance your document QA chatbot by:
- Using async calls with `asyncio` and the asynchronous `AsyncOpenAI` client.
- Streaming chat responses for real-time user feedback.
- Switching to other models such as `claude-3-5-haiku-20241022` or `gemini-2.0-flash` for different capabilities.
- Using other vector stores such as Chroma or Weaviate for scalable document retrieval.
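Streaming, for instance, can be sketched as follows, assuming the v1 `openai` Python SDK (the `stream_answer` helper name is hypothetical). With `stream=True`, the API returns an iterator of chunks whose deltas carry incremental text:

```python
def stream_answer(client, messages):
    """Stream a chat completion, printing text as it arrives.

    Assumes an OpenAI v1 client; with stream=True the call returns
    an iterator of chunks rather than a single response object.
    """
    stream = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        stream=True,
    )
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks (e.g. role or stop markers) carry no text
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)
```

You could drop this in as a replacement for the final `chat.completions.create` call in `query_qa_bot` to show answers token by token.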
Troubleshooting
- If you get authentication errors, verify that `OPENAI_API_KEY` is set correctly in your environment.
- If embedding is slow, batch your inputs or use a smaller model like `text-embedding-3-small`.
- If answers are off-topic, make sure retrieval returns relevant documents and increase `k` in the FAISS search.
- For large documents, chunk them into smaller pieces before embedding.
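For the last point, a minimal fixed-size chunker with overlap might look like this (the `chunk_text` helper is illustrative; production pipelines often split on sentence or token boundaries instead):

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping fixed-size character chunks.

    Overlap preserves context across chunk boundaries so that a
    sentence cut in two still appears whole in one of the chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Each chunk is then embedded and indexed in place of the whole document, and retrieval returns chunks rather than full documents.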
Key Takeaways
- Use OpenAI embeddings and a vector store like FAISS to enable document retrieval for QA chatbots.
- Combine retrieved document context with chat models like `gpt-4o` for accurate, context-aware answers.
- Chunk large documents and tune retrieval parameters to improve relevance and response quality.