
How to fine-tune an LLM for the legal domain

Quick answer

To fine-tune a large language model (LLM) for the legal domain, prepare a JSONL dataset of legal instructions and responses in chat format, upload it with the OpenAI API's files endpoint, and create a fine-tuning job on a base model such as gpt-4o-mini. Monitor the job until it completes, then call the resulting fine-tuned model by its name.

PREREQUISITES

Python 3.8+
OpenAI API key on an account with fine-tuning access (training is billed per token, so a free-tier key may not be enough)
pip install "openai>=1.0"

Setup

Install the openai Python package and set your API key as an environment variable for secure access.

bash

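pip install "openai>=1.0"  # quotes stop the shell from treating >= as a redirect
export OPENAI_API_KEY="your-api-key"  # placeholder; substitute your own key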

output

Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x

Step by step

Prepare your legal dataset in JSONL format, one training example per line, where each example holds a messages list with system, user, and assistant roles. Upload the file, create a fine-tuning job on a base model such as gpt-4o-mini, and poll the job status until the fine-tuned model is ready.
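As an illustration of the dataset format described above, here is a minimal sketch that turns question and answer pairs into chat-format JSONL. The system prompt, the two sample pairs, and the helper names `to_chat_record` and `write_jsonl` are hypothetical; real training data should come from vetted legal sources, and a real run needs far more than two examples (OpenAI's documentation asks for at least ten).

```python
import json

# Hypothetical system prompt and examples, for illustration only.
SYSTEM_PROMPT = "You are a legal assistant. Answer precisely and note when the answer depends on jurisdiction."

examples = [
    {
        "question": "What is a force majeure clause?",
        "answer": "A force majeure clause excuses or suspends a party's contractual "
                  "obligations when an extraordinary event beyond its control prevents performance.",
    },
    {
        "question": "What does 'consideration' mean in contract law?",
        "answer": "Consideration is something of value that each party gives or promises "
                  "to the other in exchange for the other party's promise.",
    },
]


def to_chat_record(question, answer, system_prompt=SYSTEM_PROMPT):
    """Wrap one question/answer pair in the chat 'messages' structure."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }


def write_jsonl(records, path):
    """Write one JSON object per line (the JSONL format)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


records = [to_chat_record(e["question"], e["answer"]) for e in examples]
write_jsonl(records, "legal_finetune_data.jsonl")
```

Keeping the system prompt identical across examples, and reusing it at inference time, is a common design choice so that the fine-tuned model sees the same framing in training and in production.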

python

import os
import sys
import time

from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Step 1: Upload training file
with open("legal_finetune_data.jsonl", "rb") as f:
    training_file = client.files.create(file=f, purpose="fine-tune")

print(f"Uploaded file ID: {training_file.id}")

# Step 2: Create fine-tuning job
# Fine-tuning generally expects a dated snapshot name rather than the bare alias;
# check the current list of fine-tunable models before running.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18"
)

print(f"Fine-tuning job ID: {job.id}")

# Step 3: Poll job status (simplified example)
# "cancelled" is also a terminal state; without it the loop would never exit.
while True:
    status = client.fine_tuning.jobs.retrieve(job.id)
    print(f"Status: {status.status}")
    if status.status in ["succeeded", "failed", "cancelled"]:
        break
    time.sleep(30)

# Stop here on failure: fine_tuned_model is only set when the job succeeds.
if status.status != "succeeded":
    sys.exit(f"Fine-tuning did not succeed (status: {status.status}).")

print(f"Fine-tuned model: {status.fine_tuned_model}")

# Step 4: Use the fine-tuned model
response = client.chat.completions.create(
    model=status.fine_tuned_model,
    messages=[{"role": "user", "content": "Explain contract termination clauses."}]
)
print(response.choices[0].message.content)

output

Uploaded file ID: file-abc123xyz
Fine-tuning job ID: ftjob-xyz789abc
Status: running
Status: running
Status: succeeded
Fine-tuned model: ft:gpt-4o-mini-2024-07-18:my-org::abc123
[Legal domain explanation about contract termination clauses]
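The polling loop above only prints the job status. When a job fails, the job's event log usually says why. The following is a minimal sketch of reading it, assuming the `list_events` method of the openai Python SDK (v1.x) and the `level` and `message` fields on each event; the helper name `recent_events` is hypothetical.

```python
def recent_events(client, job_id, limit=10):
    """Return (level, message) pairs for a fine-tuning job's most recent events.

    `client` is expected to be an openai.OpenAI instance (assumption: SDK v1.x).
    """
    page = client.fine_tuning.jobs.list_events(fine_tuning_job_id=job_id, limit=limit)
    return [(event.level, event.message) for event in page.data]


# Usage, with the `client` and `job` objects from the example above:
# for level, message in recent_events(client, job.id):
#     print(f"[{level}] {message}")
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub, without making a network call.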

Common variations

Use the AsyncOpenAI client with asyncio and await for non-blocking fine-tuning job polling.
Choose different base models like gpt-4o for larger capacity or gpt-4o-mini for cost efficiency.
Incorporate validation and test datasets to evaluate fine-tuned model performance on legal tasks.

python

import asyncio
import os

from openai import AsyncOpenAI

# The async client is needed here: calling the sync client inside a coroutine blocks the event loop.
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def poll_job(job_id):
    while True:
        status = await client.fine_tuning.jobs.retrieve(job_id)
        print(f"Status: {status.status}")
        # "cancelled" is also a terminal state
        if status.status in ["succeeded", "failed", "cancelled"]:
            return status
        await asyncio.sleep(30)

async def main():
    # Assume the training file was uploaded as before; "file-abc123xyz" is a placeholder ID.
    # Fine-tuning generally expects a dated snapshot name; check the current list of fine-tunable models.
    job = await client.fine_tuning.jobs.create(
        training_file="file-abc123xyz",
        model="gpt-4o-2024-08-06",
    )
    print(f"Job ID: {job.id}")
    final_status = await poll_job(job.id)
    if final_status.status == "succeeded":
        print(f"Fine-tuned model: {final_status.fine_tuned_model}")
    else:
        print("Fine-tuning failed.")

asyncio.run(main())

output

Job ID: ftjob-xyz789abc
Status: running
Status: running
Status: succeeded
Fine-tuned model: ft:gpt-4o-2024-08-06:my-org::abc123
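For the validation-dataset variation listed above, one possible approach is to hold out a fraction of the examples and pass them as a separate file when creating the job. The sketch below assumes an 80/20 split and a fixed seed; the helper names and the file name `legal_validation.jsonl` are hypothetical. The `validation_file` parameter of `fine_tuning.jobs.create` accepts the ID of the uploaded held-out file.

```python
import json
import random


def split_records(records, validation_fraction=0.2, seed=42):
    """Shuffle deterministically and return (train, validation) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    # Hold out at least one example so the validation set is never empty.
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def load_jsonl(path):
    """Read one JSON object per non-blank line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def write_jsonl(records, path):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


# Usage (file names are placeholders; `client` is an openai.OpenAI instance):
# train, val = split_records(load_jsonl("legal_finetune_data.jsonl"))
# write_jsonl(train, "legal_train.jsonl")
# write_jsonl(val, "legal_validation.jsonl")
# ...upload both files with client.files.create(..., purpose="fine-tune"), then:
# client.fine_tuning.jobs.create(
#     training_file=train_file.id,
#     validation_file=val_file.id,
#     model="gpt-4o-mini-2024-07-18",
# )
```

A fixed seed keeps the split reproducible, so later runs are evaluated against the same held-out legal examples.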

Troubleshooting

If the file upload fails, check that the file is valid JSONL and that every line has a correct messages structure.
If the fine-tuning job fails, inspect its events via the API or the dashboard for data quality issues or quota limits.
Ensure your API key has fine-tuning permissions and sufficient quota.
Use smaller datasets initially to validate the pipeline before scaling.
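The format checks above can be run locally before uploading. This is a minimal sketch, not OpenAI's own validator: it assumes only the three roles used in this article and flags examples that lack an assistant message, which is a simplification of the real rules. The function name `find_jsonl_problems` is hypothetical.

```python
import json

# Assumption: only the roles used in this article; the API accepts others too.
ALLOWED_ROLES = {"system", "user", "assistant"}


def find_jsonl_problems(path):
    """Return a list of (line_number, description) for lines that look malformed."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                problems.append((line_no, "blank line"))
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((line_no, f"invalid JSON: {exc.msg}"))
                continue
            messages = record.get("messages") if isinstance(record, dict) else None
            if not isinstance(messages, list) or not messages:
                problems.append((line_no, "missing or empty 'messages' list"))
                continue
            roles = [m.get("role") if isinstance(m, dict) else None for m in messages]
            if any(role not in ALLOWED_ROLES for role in roles):
                problems.append((line_no, f"unexpected role in {roles}"))
            elif "assistant" not in roles:
                problems.append((line_no, "no assistant message to learn from"))
    return problems


# Usage:
# for line_no, description in find_jsonl_problems("legal_finetune_data.jsonl"):
#     print(f"line {line_no}: {description}")
```

Running a check like this on a small sample first fits the advice above about validating the pipeline before scaling.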

Key Takeaways

Prepare a high-quality, domain-specific JSONL dataset with legal conversations for fine-tuning.
Use the OpenAI API's files endpoint to upload data and its fine_tuning.jobs endpoints to create jobs and monitor progress.
Select a base model that balances cost and capability, such as gpt-4o-mini.
Validate and test your fine-tuned model on legal queries before production use.
Troubleshoot by verifying data format, API permissions, and monitoring job status carefully.

Verified 2026-04 · gpt-4o-mini, gpt-4o
