How to start fine-tuning job with OpenAI API
Quick answer
To start a fine-tuning job with the OpenAI API, first prepare and upload your training data in JSONL format, then call
client.fine_tunes.create() with the uploaded file ID and desired base model. The API will return a fine-tune job ID to track progress and retrieve the fine-tuned model.PREREQUISITES
Python 3.8+OpenAI API key (free tier works)pip install openai>=1.0
Setup
Install the OpenAI Python SDK and set your API key as an environment variable for secure authentication.
pip install openai>=1.0 Step by step
Prepare your training data in JSONL format, upload it, then start the fine-tuning job by specifying the base model and training file ID.
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Step 1: Upload your training data file (JSONL format)
with open("training_data.jsonl", "rb") as f:
upload_response = client.files.create(
file=f,
purpose="fine-tune"
)
training_file_id = upload_response.id
print(f"Uploaded training file ID: {training_file_id}")
# Step 2: Create fine-tuning job
fine_tune_response = client.fine_tunes.create(
training_file=training_file_id,
model="gpt-4o"
)
print(f"Fine-tuning job started with ID: {fine_tune_response.id}")
# Optional: Check fine-tune job status
status_response = client.fine_tunes.get(id=fine_tune_response.id)
print(f"Status: {status_response.status}") output
Uploaded training file ID: file-abc123xyz Fine-tuning job started with ID: ft-xyz789abc Status: pending
Common variations
- Use different base models like
gpt-4o-miniorgpt-4odepending on your needs. - Use asynchronous calls or polling to track fine-tune job progress.
- Include validation files by uploading a
validation_fileparameter.
Troubleshooting
- If you get an error about file format, ensure your training data is valid JSONL with
promptandcompletionfields. - If the fine-tune job fails, check the
statusandeventsviaclient.fine_tunes.get()for detailed error messages. - Ensure your API key has permissions for fine-tuning.
Key Takeaways
- Upload your training data as a JSONL file with
promptandcompletionfields before starting fine-tuning. - Use
client.fine_tunes.create()with the training file ID and base model to start the fine-tuning job. - Track fine-tuning status with
client.fine_tunes.get()and handle errors by inspecting job events. - Choose the base model carefully based on your use case and resource constraints.
- Always keep your API key secure in environment variables and never hardcode it.