How to fine-tune a model for classification
Quick answer
To fine-tune a model for classification, prepare a labeled dataset in JSONL format where each example pairs a user message with the expected label as the assistant's reply. Upload the file with
client.files.create, then create a fine-tuning job with client.fine_tuning.jobs.create, specifying the base model. After training completes, send inputs to the fine-tuned model as chat messages and read the predicted label from the response.
Prerequisites
- Python 3.8+
- OpenAI API key (note that fine-tuning is a paid feature)
- pip install openai>=1.0
Setup
Install the official OpenAI Python SDK and set your API key as an environment variable.
pip install openai
Output:
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl
Installing collected packages: openai
Successfully installed openai-1.x.x
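Before making any API calls, it can help to confirm the SDK imports and the key is visible to your process. This is a minimal sanity-check sketch; the helper name sdk_ready is invented for illustration:

```python
import os
import importlib


def sdk_ready() -> bool:
    """Return True if the openai package imports and OPENAI_API_KEY is set."""
    try:
        importlib.import_module("openai")
    except ImportError:
        return False
    return bool(os.environ.get("OPENAI_API_KEY"))


print("SDK ready:", sdk_ready())
```

If this prints False, fix your environment before running the steps below.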
Step by step
Prepare your classification dataset in JSONL format where each line contains a JSON object with messages (including user input) and the expected label as the assistant's response. Upload the file, create a fine-tuning job, and then use the fine-tuned model for inference.
import os
import json
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Step 1: Prepare training data file (classification format)
# Each line in training.jsonl pairs the user input with the label as the
# assistant's reply, e.g.:
# {"messages": [{"role": "user", "content": "Is this email spam?"},
#               {"role": "assistant", "content": "spam"}]}
# Step 2: Upload training file
with open("training.jsonl", "rb") as f:
    training_file = client.files.create(file=f, purpose="fine-tune")
print(f"Uploaded training file ID: {training_file.id}")
# Step 3: Create fine-tuning job
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",
)
print(f"Fine-tuning job ID: {job.id}")
# Step 4: Poll job status (simplified example)
import time
while True:
    status = client.fine_tuning.jobs.retrieve(job.id)
    print(f"Status: {status.status}")
    if status.status in ["succeeded", "failed", "cancelled"]:
        break
    time.sleep(30)
# Step 5: Use fine-tuned model for classification
if status.status == "succeeded":
    fine_tuned_model = status.fine_tuned_model
    messages = [{"role": "user", "content": "Is this email spam?"}]
    response = client.chat.completions.create(
        model=fine_tuned_model,
        messages=messages,
    )
    print("Classification result:", response.choices[0].message.content)
Output:
Uploaded training file ID: file-abc123xyz
Fine-tuning job ID: job-xyz789abc
Status: running
Status: running
Status: succeeded
Classification result: spam
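The training file itself can be generated from a list of (text, label) pairs. The snippet below is a minimal sketch; the example texts and labels are made up for illustration:

```python
import json

# Hypothetical labeled examples; replace with your own data.
examples = [
    ("Win a free cruise now!!!", "spam"),
    ("Meeting moved to 3pm tomorrow", "not_spam"),
    ("Claim your prize before midnight", "spam"),
]

# Write one JSON object per line, with the label as the assistant's reply.
with open("training.jsonl", "w", encoding="utf-8") as f:
    for text, label in examples:
        record = {
            "messages": [
                {"role": "user", "content": text},
                {"role": "assistant", "content": label},
            ]
        }
        f.write(json.dumps(record) + "\n")
```

The resulting file can then be uploaded with client.files.create exactly as in Step 2.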
Common variations
- Use async calls with asyncio and await for non-blocking fine-tuning job polling.
- Change the base model to gpt-4o or gpt-4o-mini depending on your accuracy and cost needs.
- For multi-class classification, format labels as distinct strings and ensure the training data covers all classes.
import asyncio

async def poll_job(job_id):
    # Note: this still uses the synchronous client; for fully non-blocking
    # calls, use the SDK's AsyncOpenAI client instead.
    while True:
        status = client.fine_tuning.jobs.retrieve(job_id)
        print(f"Status: {status.status}")
        if status.status in ["succeeded", "failed"]:
            return status
        await asyncio.sleep(30)

# Usage:
# asyncio.run(poll_job(job.id))
Output:
Status: running
Status: running
Status: succeeded
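For the multi-class variation, it is worth confirming that every expected class actually appears in the training file before uploading. A small check along these lines (the helper name and file path are illustrative):

```python
import json
from collections import Counter


def class_counts(path: str) -> Counter:
    """Count training examples per label, taken from each line's last message."""
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            counts[record["messages"][-1]["content"]] += 1
    return counts
```

Comparing class_counts("training.jsonl").keys() against your intended label set catches missing classes early, before any tokens are spent on training.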
Troubleshooting
- If you see Invalid file format, ensure each JSONL line has a messages list and that the label appears as the final assistant message.
- If the fine-tuning job fails, check your dataset size and format; classification fine-tuning needs enough labeled examples in every class.
- Use client.fine_tuning.jobs.retrieve(job.id) to get detailed error messages.
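Many Invalid file format errors can be caught locally before uploading. The helper below is a rough pre-flight check (not the API's actual validator) that each line is valid JSON with a messages list ending in an assistant reply:

```python
import json
from typing import List


def validate_jsonl(path: str) -> List[str]:
    """Return human-readable problems found in a fine-tuning JSONL file."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f, start=1):
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                problems.append(f"line {i}: not valid JSON")
                continue
            msgs = record.get("messages")
            if not isinstance(msgs, list) or not msgs:
                problems.append(f"line {i}: missing or empty 'messages' list")
            elif not isinstance(msgs[-1], dict) or msgs[-1].get("role") != "assistant":
                problems.append(f"line {i}: last message must be the assistant's label")
    return problems
```

An empty return list means the file passed these basic checks; the API may still enforce additional constraints.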
Key Takeaways
- Prepare classification data as JSONL with user messages and label responses.
- Upload data and create fine-tuning jobs using
client.files.createandclient.fine_tuning.jobs.create. - Poll job status until completion before using the fine-tuned model for inference.
- Use async polling or different base models to optimize workflow and cost.
- Check error messages carefully for data format issues or insufficient training data.