How to build a legal RAG system
Quick answer
Build a legal RAG system by combining a vector database that stores embeddings of your legal documents with a large language model (LLM) such as gpt-4o that generates answers from retrieved context. Use OpenAI embeddings to convert legal texts into vectors, query those vectors to find the most relevant documents, and feed the results as context to the LLM for precise legal responses.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- `pip install "openai>=1.0"` (quote the version specifier so the shell does not treat `>` as a redirect)
- `pip install faiss-cpu` or `chromadb`
Setup
Install required Python packages and set your environment variable for the OpenAI API key.
- Use `faiss-cpu` or `chromadb` for vector search.
- Set `OPENAI_API_KEY` in your environment.
pip install openai faiss-cpu

Output:
Collecting openai
Collecting faiss-cpu
Successfully installed openai-1.x faiss-cpu-1.x
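The setup step above also requires the API key to be visible to your Python process. On macOS/Linux you can export it in the current shell before running the scripts (the key value below is a placeholder, not a real key):

```shell
# Set the API key for the current shell session (placeholder value)
export OPENAI_API_KEY="sk-your-key-here"

# Confirm it is visible to child processes such as the Python interpreter
echo "${OPENAI_API_KEY:+key is set}"
```

On Windows, set the variable through the system environment settings or `setx OPENAI_API_KEY ...` instead.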
Step by step
This example shows how to embed legal documents, store them in a vector index, query relevant documents, and generate answers using gpt-4o.
import os
from openai import OpenAI
import faiss
import numpy as np
# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Sample legal documents
legal_docs = [
"Section 1: Contract terms and obligations.",
"Section 2: Intellectual property rights.",
"Section 3: Liability and indemnification.",
"Section 4: Termination clauses and conditions."
]
# Step 1: Create embeddings for legal docs
response = client.embeddings.create(
model="text-embedding-3-small",
input=legal_docs
)
embeddings = [data.embedding for data in response.data]
# Step 2: Build FAISS index
dimension = len(embeddings[0])
index = faiss.IndexFlatL2(dimension)
index.add(np.array(embeddings).astype('float32'))
# Step 3: Query embedding for user question
query = "What are the termination conditions in the contract?"
query_response = client.embeddings.create(
model="text-embedding-3-small",
input=[query]
)
query_embedding = np.array(query_response.data[0].embedding).astype('float32')
# Step 4: Search top 2 relevant docs
k = 2
D, I = index.search(np.array([query_embedding]), k)
relevant_docs = [legal_docs[i] for i in I[0]]
# Step 5: Generate answer with context
context = "\n".join(relevant_docs)
prompt = f"You are a legal assistant. Use the following context to answer the question.\nContext:\n{context}\nQuestion: {query}\nAnswer:"
chat_response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": prompt}]
)
answer = chat_response.choices[0].message.content
print("Answer:", answer)

Output:
Answer: The termination conditions in the contract include the clauses outlined in Section 4, which specify the conditions under which the contract may be terminated by either party.
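Real contracts are far longer than the four sample strings above, so in practice each document is split into overlapping chunks before Step 1, and each chunk is embedded and indexed exactly as the sections were. A minimal character-based sketch (the chunk size and overlap values are illustrative, not tuned):

```python
def chunk_text(text, chunk_size=400, overlap=50):
    """Split text into overlapping character-based chunks.

    The overlap keeps clauses that straddle a chunk boundary
    retrievable from at least one chunk.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

contract = "Section 4: Termination clauses and conditions. " * 20
chunks = chunk_text(contract, chunk_size=100, overlap=20)
print(len(chunks), "chunks")  # each chunk is at most 100 characters
```

Production systems usually chunk on sentence or clause boundaries rather than raw characters, but the overlap idea is the same.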
Common variations
You can enhance your legal RAG system by:
- Using `chromadb` instead of FAISS for easier setup and persistence.
- Switching to `gpt-4o-mini` for cost-effective inference.
- Implementing async calls with `asyncio` for scalable querying.
- Adding document chunking and metadata filtering for more precise retrieval.
import asyncio
import os
from openai import AsyncOpenAI

async def async_legal_rag():
    # Async calls require the AsyncOpenAI client; the sync OpenAI client
    # has no awaitable acreate() method in openai>=1.0.
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    query = "Explain liability clauses."
    embed_resp = await client.embeddings.create(model="text-embedding-3-small", input=[query])
    query_emb = embed_resp.data[0].embedding
    # Assume an async vector search over the index here
    # Then generate the answer asynchronously
    prompt = f"Use legal docs context to answer: {query}"
    chat_resp = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    )
    print(chat_resp.choices[0].message.content)

asyncio.run(async_legal_rag())

Output:
Liability clauses define the responsibilities and limits of each party in case of damages or losses, protecting parties from excessive claims.
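The metadata filtering mentioned in the variations can be sketched without any external services: attach a metadata dict to each chunk, restrict the candidate set by metadata first, then rank only the survivors by vector similarity. The tiny 3-dimensional vectors below are toy stand-ins for real embeddings:

```python
import numpy as np

# Toy corpus: each entry pairs a placeholder embedding with metadata
docs = [
    {"text": "Termination clauses.", "vec": np.array([0.9, 0.1, 0.0]), "doc_type": "contract"},
    {"text": "Liability limits.",    "vec": np.array([0.1, 0.9, 0.0]), "doc_type": "contract"},
    {"text": "Patent filing memo.",  "vec": np.array([0.8, 0.2, 0.0]), "doc_type": "memo"},
]

def search(query_vec, doc_type, k=1):
    # 1) Filter by metadata before any similarity math
    candidates = [d for d in docs if d["doc_type"] == doc_type]

    # 2) Rank the remaining docs by cosine similarity
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    candidates.sort(key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["text"] for d in candidates[:k]]

print(search(np.array([1.0, 0.0, 0.0]), doc_type="contract"))
# → ['Termination clauses.']
```

Vector stores such as chromadb expose the same pattern directly through metadata `where` filters, so you rarely need to hand-roll it beyond a prototype.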
Troubleshooting
- If embeddings return errors, verify your `OPENAI_API_KEY` and the model name.
- If vector search returns irrelevant results, increase the number of retrieved documents or improve document chunking.
- If the LLM output is vague, provide clearer context or use system prompts to instruct the model.
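The single-string prompt in the walkthrough works, but moving the standing instructions into a system message, as suggested above, usually makes the model follow them more reliably. A sketch of the messages structure (the instruction wording is illustrative):

```python
def build_messages(context, question):
    """Separate standing instructions (system) from per-query input (user)."""
    return [
        {
            "role": "system",
            "content": (
                "You are a legal assistant. Answer only from the provided "
                "context. If the context does not contain the answer, say so."
            ),
        },
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]

messages = build_messages("Section 4: Termination clauses.", "When can we terminate?")
# Pass `messages` to client.chat.completions.create(model="gpt-4o", messages=messages)
print(messages[0]["role"], "->", messages[1]["role"])  # system -> user
```

Keeping the instructions in the system role also means the user turn contains only retrieved context and the question, which makes vague outputs easier to debug.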
Key takeaways
- Use OpenAI embeddings to vectorize legal documents for semantic search.
- Combine vector search results as context to guide the LLM for accurate legal answers.
- Choose models like `gpt-4o` or `gpt-4o-mini` to balance cost and performance.
- Implement async calls and document chunking for scalable, precise legal RAG systems.
- Validate API keys and model names to avoid common errors.
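The last takeaway can be folded into a small preflight check that runs before any API call and fails fast on a missing key or an unknown model name. The allowed-model set below is an assumption for this sketch, not an official list:

```python
import os

# Assumed allow-list for this sketch; extend it for your deployment
KNOWN_MODELS = {"gpt-4o", "gpt-4o-mini", "text-embedding-3-small"}

def preflight(model):
    """Fail fast before spending an API call on a doomed request."""
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    if model not in KNOWN_MODELS:
        raise ValueError(f"Unrecognized model name: {model!r}")
    return True
```

Calling `preflight("gpt-4o")` at startup surfaces configuration mistakes immediately instead of as opaque API errors mid-run.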