How-to · Intermediate · 3 min read

How to build a document QA system with OpenAI

Quick answer
Use OpenAI embeddings to vectorize document chunks and store them in a vector index such as FAISS. Then search the index with the user's question and pass the retrieved chunks as context to a gpt-4o chat completion for accurate, document-grounded answers.

Prerequisites

  • Python 3.8+
  • OpenAI API key
  • pip install openai faiss-cpu numpy

Setup

Install required packages and set your OpenAI API key as an environment variable.

  • Install packages: pip install openai faiss-cpu numpy
  • Set environment variable in your shell: export OPENAI_API_KEY='your_api_key_here'
```bash
pip install openai faiss-cpu numpy
export OPENAI_API_KEY='your_api_key_here'
```
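
If you hit authentication errors later, it usually means the variable was not exported in the shell that launched Python. A small sketch that fails fast with a clear message (`require_api_key` is a hypothetical helper, not part of the SDK):

```python
import os

def require_api_key() -> str:
    """Return the OpenAI API key, or raise a clear error if it is unset."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("Set OPENAI_API_KEY before running the examples below.")
    return key
```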

Step by step

This example loads a text document, splits it into chunks, creates embeddings with OpenAI's text-embedding-3-small model, indexes them with FAISS, and answers questions by retrieving the most relevant chunks and querying gpt-4o.

```python
import os

import faiss
import numpy as np
from openai import OpenAI

# Initialize the OpenAI client from the environment variable set above
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample document text
document_text = """OpenAI develops powerful AI models. Document question answering allows users to query large texts efficiently. This example shows how to build a QA system."""

# Split the document into chunks (naive sentence split; see Troubleshooting)
chunks = [chunk.strip() for chunk in document_text.split('.') if chunk.strip()]

# Create an embedding for each chunk with OpenAI's embedding model
embeddings = []
for chunk in chunks:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=chunk
    )
    embeddings.append(response.data[0].embedding)

embeddings = np.array(embeddings, dtype="float32")

# Build a FAISS index over the chunk embeddings
dimension = embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(embeddings)

def answer_question(question: str) -> str:
    # Embed the question with the same model used for the chunks
    q_embedding_resp = client.embeddings.create(
        model="text-embedding-3-small",
        input=question
    )
    q_embedding = np.array(q_embedding_resp.data[0].embedding, dtype="float32")
    q_embedding = np.expand_dims(q_embedding, axis=0)

    # Search FAISS for the top 2 most relevant chunks
    D, I = index.search(q_embedding, k=2)
    relevant_chunks = [chunks[i] for i in I[0]]

    # Build the prompt with the retrieved context
    context = "\n".join(relevant_chunks)
    messages = [
        {"role": "system", "content": "You are a helpful assistant answering questions based on provided document context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}
    ]

    # Ask gpt-4o to answer from the retrieved context
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
    return response.choices[0].message.content

# Example usage
question = "What does OpenAI develop?"
answer = answer_question(question)
print("Q:", question)
print("A:", answer)
```

Output:

```text
Q: What does OpenAI develop?
A: OpenAI develops powerful AI models.
```
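
A note on the distance metric: IndexFlatL2 ranks by Euclidean distance, while embedding similarity is usually discussed in terms of cosine similarity. Because OpenAI's text-embedding-3 vectors come back unit-normalized, the two rankings agree: for unit vectors, squared L2 distance is 2 minus twice the cosine similarity. A minimal numpy check of that identity (the vectors here are made up for illustration):

```python
import numpy as np

def l2_sq(a: np.ndarray, b: np.ndarray) -> float:
    """Squared Euclidean distance between two vectors."""
    return float(np.sum((a - b) ** 2))

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two unit-length vectors: ||a - b||^2 == 2 - 2 * cos(a, b)
a = np.array([1.0, 0.0])
b = np.array([0.6, 0.8])
assert abs(l2_sq(a, b) - (2 - 2 * cosine(a, b))) < 1e-9
```

This is why switching to faiss.IndexFlatIP (inner product) on normalized vectors gives the same neighbors, just sorted by similarity instead of distance.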

Common variations

You can enhance this system by:

  • Using async calls with asyncio and OpenAI's async client.
  • Streaming partial answers from gpt-4o for faster UX.
  • Switching embedding models (e.g., text-embedding-3-large for higher quality).
  • Using other vector stores like Chroma or Pinecone for scalability.

Troubleshooting

If you get empty or irrelevant answers:

  • Check your document chunking strategy; too large or too small chunks reduce retrieval quality.
  • Verify your API key is set correctly in os.environ["OPENAI_API_KEY"].
  • Ensure you use valid model names: text-embedding-3-small for embeddings and gpt-4o for chat completions.
  • Monitor API usage limits and errors in your console or logs.
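
A common fix for the first point is fixed-size chunks with overlap, so a sentence cut at one boundary still appears intact in the neighboring chunk. A minimal character-based sketch (`chunk_text` and its default sizes are illustrative; a production system would usually count tokens, e.g. with tiktoken):

```python
def chunk_text(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # step forward, keeping `overlap` chars shared
    return chunks
```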

Key takeaways

  • Use OpenAI embeddings to vectorize document chunks for efficient retrieval.
  • Combine FAISS vector search with gpt-4o chat completions for accurate answers.
  • Always split documents into meaningful chunks to improve context relevance.
  • Keep API keys secure and use environment variables for all calls.
  • Experiment with different embedding models and vector stores for scalability.
Verified 2026-04 · gpt-4o, text-embedding-3-small