How to build an AI-powered FAQ system
Quick answer
Build an AI-powered FAQ system by converting your FAQ entries into vector embeddings, storing them in a vector index, and retrieving the closest match for each user question. A large language model such as `gpt-4o` then generates the final answer from the matched entry via `client.chat.completions.create` in the OpenAI API.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- `pip install openai>=1.0`
- `pip install faiss-cpu` or `chromadb` (for the vector store)
Setup
Install the required Python packages and set your OpenAI API key as an environment variable.
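Setting the key as an environment variable might look like this (macOS/Linux syntax; the key value shown is a placeholder, not a real key):

```shell
# Replace the placeholder with your actual OpenAI API key
export OPENAI_API_KEY="sk-your-key-here"
```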
```shell
pip install openai faiss-cpu
```

Step by step
This example shows how to embed FAQ entries, store them in a vector index, and query with gpt-4o to answer user questions.
```python
import os

import faiss
import numpy as np
from openai import OpenAI

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample FAQ data
faq_data = [
    {"question": "What is your return policy?", "answer": "You can return items within 30 days."},
    {"question": "How do I track my order?", "answer": "Use the tracking link sent to your email."},
    {"question": "Do you ship internationally?", "answer": "Yes, we ship worldwide with additional fees."},
]

# Step 1: Embed FAQ questions
faq_texts = [item["question"] for item in faq_data]
response = client.embeddings.create(
    model="text-embedding-3-large",
    input=faq_texts,
)
embeddings = np.array([e.embedding for e in response.data]).astype("float32")

# Step 2: Build FAISS index (exact L2 search)
dimension = embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(embeddings)

# Step 3: Query function
def query_faq(user_question):
    # Embed the user question with the same model used for the FAQ entries
    q_resp = client.embeddings.create(
        model="text-embedding-3-large",
        input=[user_question],
    )
    q_embedding = np.array(q_resp.data[0].embedding).astype("float32")

    # Search for the single nearest FAQ entry
    distances, indices = index.search(q_embedding.reshape(1, -1), 1)
    matched_faq = faq_data[indices[0][0]]

    # Use the LLM to generate an answer grounded in the matched FAQ
    prompt = (
        f"User question: {user_question}\n"
        f"FAQ question: {matched_faq['question']}\n"
        f"FAQ answer: {matched_faq['answer']}\n"
        "Provide a concise answer to the user question based on the FAQ answer."
    )
    chat_resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return chat_resp.choices[0].message.content

# Example usage
print(query_faq("Can I return a product after a month?"))
```

Output
You can return items within 30 days, so unfortunately returns after a month are not accepted.
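Under the hood, FAISS's `IndexFlatL2` performs an exact brute-force L2 scan over the stored vectors. Conceptually, the retrieval step above is equivalent to this plain-NumPy sketch (toy 2-D vectors here, not real embeddings; note FAISS reports squared L2 distances, while this sketch uses the norm — the ranking is the same):

```python
import numpy as np

# Toy "embeddings" for three FAQ entries (real ones have thousands of dims)
stored = np.array([
    [0.0, 1.0],
    [1.0, 0.0],
    [0.7, 0.7],
], dtype="float32")

def nearest_l2(query, vectors):
    """Return (index, distance) of the stored vector closest to query under L2."""
    dists = np.linalg.norm(vectors - query, axis=1)
    best = int(np.argmin(dists))
    return best, float(dists[best])

idx, dist = nearest_l2(np.array([0.9, 0.1], dtype="float32"), stored)
print(idx)  # index of the closest stored vector
```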
Common variations
- Use `gpt-4o-mini` for faster, cheaper responses with slightly less accuracy.
- Implement async calls with `asyncio` and `await` for scalable web apps.
- Swap FAISS for `chromadb` or another vector store for persistence and cloud integration.
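Another useful variation (an addition of this guide, not part of the code above) is to reject matches whose distance exceeds a threshold and fall back to a canned reply, so off-topic questions aren't forced onto the nearest FAQ. A minimal sketch, with a hypothetical `answer_or_fallback` helper and a made-up threshold you would tune on real embedding distances:

```python
# Hypothetical cutoff; calibrate against distances seen on real queries
MAX_DISTANCE = 1.0

def answer_or_fallback(distance, matched_answer):
    """Return the matched FAQ answer only if the match is close enough."""
    if distance > MAX_DISTANCE:
        return "Sorry, I couldn't find a matching FAQ entry."
    return matched_answer

print(answer_or_fallback(0.3, "You can return items within 30 days."))
```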
Troubleshooting
- If embeddings return errors, verify your API key and model name.
- If FAISS index search returns no results, check that embeddings are correctly computed and indexed.
- For unexpected LLM answers, refine the prompt or increase `max_tokens` in `client.chat.completions.create`.
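Transient API errors (rate limits, timeouts) are also common in practice. One generic way to handle them is a retry-with-backoff wrapper; the helper below is an assumption of this guide, not part of the OpenAI SDK (which also has its own client-level retry configuration):

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (2 ** attempt))

# Hypothetical usage with the embeddings call from the example above:
# result = with_retries(lambda: client.embeddings.create(
#     model="text-embedding-3-large", input=["hello"]))
```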
Key takeaways
- Use vector embeddings to convert FAQ questions into searchable numeric form.
- Leverage a vector store like FAISS to find the most relevant FAQ entry for a user query.
- Use a large language model like `gpt-4o` to generate natural, context-aware answers.
- Keep your prompts clear and concise to improve answer quality.
- Test and tune embedding and retrieval parameters for best FAQ matching.