How to build a study assistant with RAG
Quick answer
Build a study assistant with RAG by combining a vector database for document retrieval with a large language model (LLM) such as gpt-4o. First, embed your study materials using OpenAI embeddings, then retrieve the most relevant context at query time to augment the LLM's responses with accurate, context-aware answers.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install openai>=1.0
- pip install faiss-cpu
- pip install numpy
Setup
Install required Python packages and set your OPENAI_API_KEY environment variable.
- Use the openai SDK v1+ for the LLM and embeddings.
- Use faiss-cpu for vector similarity search.

pip install openai faiss-cpu numpy

output
Collecting openai... Collecting faiss-cpu... Collecting numpy... Successfully installed openai faiss-cpu numpy
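
The client reads the key from the environment, so set OPENAI_API_KEY before running any of the code below. On macOS/Linux you can set it for the current shell session (the key value here is a placeholder, not a real key):

```shell
# Set the API key for the current shell session (placeholder value).
export OPENAI_API_KEY="sk-your-key-here"
```

On Windows, use `setx OPENAI_API_KEY "sk-your-key-here"` in a new terminal instead.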
Step by step
This example shows how to embed study documents, build a vector index with FAISS, and query gpt-4o-mini using the retrieved context as a study assistant.
import os
import numpy as np
import faiss
from openai import OpenAI
# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Sample study documents
documents = [
"Photosynthesis is the process by which green plants convert sunlight into energy.",
"The mitochondria is the powerhouse of the cell.",
"Newton's second law states that Force equals mass times acceleration.",
"The capital of France is Paris."
]
# Step 1: Embed documents
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=documents
)
embeddings = np.array([data.embedding for data in response.data]).astype('float32')
# Step 2: Build FAISS index
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)
# Step 3: Query function
def query_study_assistant(question, k=2):
    # Embed the question
    q_resp = client.embeddings.create(model="text-embedding-3-small", input=[question])
    q_emb = np.array(q_resp.data[0].embedding).astype('float32').reshape(1, -1)
    # Search for the top-k most relevant documents
    distances, indices = index.search(q_emb, k)
    context = "\n".join([documents[i] for i in indices[0]])
    # Compose the prompt with the retrieved context
    prompt = f"Use the following study notes to answer the question.\n\n{context}\n\nQuestion: {question}\nAnswer:"
    # Call the LLM
    chat_resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    )
    return chat_resp.choices[0].message.content
# Example usage
question = "What is the role of mitochondria in cells?"
answer = query_study_assistant(question)
print("Q:", question)
print("A:", answer)

output
Q: What is the role of mitochondria in cells?
A: The mitochondria is the powerhouse of the cell, responsible for producing energy through cellular respiration.
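
For intuition, `faiss.IndexFlatL2` performs an exact nearest-neighbour search under squared L2 distance. A minimal pure-Python sketch of the same search, using made-up 3-dimensional toy vectors rather than real 1536-dimensional embeddings:

```python
def l2_search(query, vectors, k=2):
    # Squared L2 distance between the query and each stored vector,
    # the same metric faiss.IndexFlatL2 uses.
    dists = [sum((q - v) ** 2 for q, v in zip(query, vec)) for vec in vectors]
    # Return the indices of the k closest vectors, nearest first.
    return sorted(range(len(vectors)), key=lambda i: dists[i])[:k]

# Toy "embeddings": doc 0 and doc 2 point in nearly the same direction.
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
print(l2_search([1.0, 0.0, 0.0], docs, k=2))  # [0, 2]
```

FAISS does the same computation with optimized vectorized code, which is what makes it practical at the scale of thousands or millions of documents.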
Common variations
- Use async calls with AsyncOpenAI and await client.chat.completions.create() for concurrency.
- Switch to smaller models like gpt-4o-mini for cost efficiency.
- Use other vector stores like Chroma or FAISS GPU for scalability.
- Incorporate chunking and metadata for large documents.
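
One simple way to chunk large documents, as the last variation suggests, is fixed-size windows with overlap, so that context straddling a chunk boundary is not lost. A minimal word-based sketch (the sizes here are arbitrary examples; each resulting chunk would then be embedded and indexed like the `documents` list above):

```python
def chunk_text(text, chunk_size=50, overlap=10):
    # Split into chunks of chunk_size words, each overlapping
    # the previous chunk by `overlap` words.
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunk = words[start:start + chunk_size]
        if chunk:
            chunks.append(" ".join(chunk))
        if start + chunk_size >= len(words):
            break
    return chunks

sample = " ".join(f"word{i}" for i in range(120))
print([len(c.split()) for c in chunk_text(sample)])  # [50, 50, 40]
```

Token-aware chunking (e.g. with a tokenizer library) tracks model limits more precisely, but word counts are a reasonable first approximation.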
Troubleshooting
- If embeddings are empty or errors occur, verify that your OPENAI_API_KEY is set correctly.
- If the FAISS index search returns no results, check that the embedding dimensions match.
- For incomplete answers, increase k to retrieve more context.
- Watch token limits in the prompt; chunk large documents accordingly.
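
On the dimension mismatch point: a FAISS index built with dimension d rejects vectors of any other length, so it can help to assert the dimension before adding or searching. A small sketch (1536 matches the default output size of text-embedding-3-small):

```python
def check_dimension(embedding, index_dim=1536):
    # Fail early and clearly if an embedding does not match the index.
    if len(embedding) != index_dim:
        raise ValueError(
            f"Embedding has {len(embedding)} dims, index expects {index_dim}"
        )
    return True

print(check_dimension([0.0] * 1536))  # True
```

Mismatches typically appear after switching embedding models, since different models emit different vector sizes.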
Key Takeaways
- Use OpenAI embeddings to convert study materials into vectors for retrieval.
- Combine vector search with LLM prompts to provide context-aware study answers.
- Adjust retrieval count and model size to balance cost and accuracy.
- Ensure environment variables and embedding dimensions are consistent.
- Chunk large documents to stay within token limits for LLM input.