How to monitor fine-tuning job
Quick answer
Use the OpenAI Python SDK's
client.fine_tuning.jobs.retrieve() method with the fine-tuning job ID to check the job status and details. Poll this endpoint periodically until the job completes, then use the fine_tuned_model field to run inference with the fine-tuned model.PREREQUISITES
Python 3.8+OpenAI API key (free tier works)pip install openai>=1.0
Setup
Install the latest OpenAI Python SDK and set your API key as an environment variable.
pip install openai output
Collecting openai Downloading openai-1.x.x-py3-none-any.whl Installing collected packages: openai Successfully installed openai-1.x.x
Step by step
This example shows how to upload a training file, create a fine-tuning job, and poll the job status until completion.
import os
import time
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Upload training file
training_file = client.files.create(
file=open("training.jsonl", "rb"),
purpose="fine-tune"
)
print(f"Uploaded file ID: {training_file.id}")
# Create fine-tuning job
job = client.fine_tuning.jobs.create(
training_file=training_file.id,
model="gpt-4o-mini-2024-07-18"
)
print(f"Created fine-tuning job ID: {job.id}")
# Poll job status until done
while True:
status = client.fine_tuning.jobs.retrieve(job.id)
print(f"Job status: {status.status}")
if status.status in ["succeeded", "failed"]:
break
time.sleep(10) # wait 10 seconds before next check
if status.status == "succeeded":
print(f"Fine-tuned model: {status.fine_tuned_model}")
else:
print("Fine-tuning job failed.") output
Uploaded file ID: file-abc123xyz Created fine-tuning job ID: ftjob-xyz789abc Job status: pending Job status: running Job status: running Job status: succeeded Fine-tuned model: gpt-4o-mini-ft-2026-04-01-xyz
Common variations
You can monitor fine-tuning jobs asynchronously or use different base models. The polling interval can be adjusted based on your needs.
import asyncio
import os
from openai import OpenAI
async def monitor_fine_tuning_job(job_id: str):
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
while True:
status = client.fine_tuning.jobs.retrieve(job_id)
print(f"Async job status: {status.status}")
if status.status in ["succeeded", "failed"]:
return status
await asyncio.sleep(10)
# Usage example
# asyncio.run(monitor_fine_tuning_job("ftjob-xyz789abc")) output
Async job status: pending Async job status: running Async job status: succeeded
Troubleshooting
- If you receive a 404 error when retrieving the job, verify the job ID is correct.
- If the job status stays in "pending" for too long, check your training file format and API usage limits.
- Use
client.files.list()to confirm your training file upload.
Key Takeaways
- Use
client.fine_tuning.jobs.retrieve()with the job ID to monitor fine-tuning status. - Poll the job status periodically until it reaches "succeeded" or "failed".
- After success, use the
fine_tuned_modelfield to run inference with the fine-tuned model.