How to fine-tune an LLM for coding
Quick answer
To fine-tune an LLM for coding, prepare a dataset of code examples, then use a fine-tuning API or framework such as OpenAI's fine-tuning endpoint with a base model like gpt-4o. Fine-tuning adjusts the model's weights to improve coding accuracy and style on your specific tasks.
Prerequisites
- Python 3.8+
- OpenAI API key with billing enabled (fine-tuning is a paid feature)
- pip install openai>=1.0
Setup
Install the OpenAI Python SDK and set your API key as an environment variable to access the fine-tuning API.
pip install openai>=1.0
# In your shell:
# export OPENAI_API_KEY="your-api-key"
Step by step
Prepare a JSONL file of chat-formatted code examples (each line is a `messages` array pairing a user prompt with an assistant completion; the older prompt-completion format only applies to legacy completions models), then call the fine-tuning API to create a fine-tuned model. Use the fine-tuned model for coding completions.
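The training file can be generated programmatically rather than hand-edited, which avoids JSON escaping mistakes. A minimal sketch, assuming the filename used below; the two pairs are illustrative only, and a real job needs substantially more examples:

```python
import json

# Illustrative training pairs: user prompt -> assistant completion
examples = [
    ("def add(a, b):", "    return a + b"),
    ("def square(n):", "    return n * n"),
]

# Each JSONL line is a chat transcript the model should learn to complete
with open("code_finetune_data.jsonl", "w") as f:
    for prompt, completion in examples:
        record = {
            "messages": [
                {"role": "user", "content": prompt},
                {"role": "assistant", "content": completion},
            ]
        }
        f.write(json.dumps(record) + "\n")
```

Using json.dumps guarantees that newlines and quotes inside code completions are escaped correctly on every line.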
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Step 1: Prepare training data in chat-format JSONL
# Example content of 'code_finetune_data.jsonl' (two lines shown for brevity;
# the API requires at least 10 examples):
# {"messages": [{"role": "user", "content": "def add(a, b):"}, {"role": "assistant", "content": "    return a + b"}]}
# {"messages": [{"role": "user", "content": "def factorial(n):"}, {"role": "assistant", "content": "    if n == 0:\n        return 1\n    return n * factorial(n-1)"}]}
# Step 2: Upload training file
upload_response = client.files.create(
    file=open("code_finetune_data.jsonl", "rb"),
    purpose="fine-tune"
)
file_id = upload_response.id
# Step 3: Create fine-tune job
fine_tune_response = client.fine_tuning.jobs.create(
    training_file=file_id,
    model="gpt-4o-2024-08-06"  # fine-tuning targets a dated snapshot, not the "gpt-4o" alias
)
print("Fine-tune job created:", fine_tune_response.id)
# Step 4: After fine-tuning completes, use the fine-tuned model
# Replace 'fine_tuned_model_name' with the fine_tuned_model value from the
# completed job (it starts with 'ft:')
response = client.chat.completions.create(
    model="fine_tuned_model_name",
    messages=[{"role": "user", "content": "def fibonacci(n):"}]
)
print(response.choices[0].message.content)
Output
Fine-tune job created: ftjob-A1B2C3D4E5
# Output from the fine-tuned model might be:
#     if n <= 1:
#         return n
#     else:
#         return fibonacci(n-1) + fibonacci(n-2)
Common variations
You can poll the fine-tune job status asynchronously, use a smaller base model such as gpt-4o-mini for faster and cheaper training, or use streaming completions for real-time code generation.
import time
# Poll the fine-tune job until it reaches a terminal state
job_id = fine_tune_response.id
while True:
    job = client.fine_tuning.jobs.retrieve(job_id)
    print(f"Fine-tune status: {job.status}")
    if job.status in ["succeeded", "failed", "cancelled"]:
        break
    time.sleep(30)
# The completed job carries the fine-tuned model name
print("Fine-tuned model:", job.fine_tuned_model)
# Use streaming for code generation
response = client.chat.completions.create(
    model="fine_tuned_model_name",
    messages=[{"role": "user", "content": "def reverse_string(s):"}],
    stream=True
)
for chunk in response:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")
Output
Fine-tune status: running
Fine-tune status: succeeded
def reverse_string(s):
    return s[::-1]
Troubleshooting
- If you see errors uploading files, verify the JSONL format and file size limits.
- If fine-tuning fails, check your training data quality and ensure prompts and completions are well paired.
- Timeouts during polling can be handled by increasing sleep intervals.
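To catch format problems locally before uploading, each line of the training file can be checked for valid JSON and a well-formed messages array. A sketch, assuming the chat-format JSONL described above; the validator function and its checks are illustrative, not an official tool:

```python
import json

def validate_jsonl(path):
    """Return 'ok' or a description of the first malformed line."""
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            # Every line must parse as standalone JSON
            try:
                record = json.loads(line)
            except json.JSONDecodeError as e:
                return f"line {lineno}: invalid JSON ({e})"
            # Chat-format training data needs a non-empty messages array
            messages = record.get("messages")
            if not isinstance(messages, list) or not messages:
                return f"line {lineno}: missing 'messages' array"
            for msg in messages:
                if "role" not in msg or "content" not in msg:
                    return f"line {lineno}: message missing 'role' or 'content'"
    return "ok"
```

Running this before client.files.create turns an opaque upload or training failure into a precise line number to fix.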
Key Takeaways
- Fine-tune LLMs by preparing prompt-completion JSONL datasets tailored to coding tasks.
- Use OpenAI's fine-tuning API with models like gpt-4o for best coding results.
- Poll fine-tune job status asynchronously and use streaming completions for interactive coding.
- Validate training data format and quality to avoid fine-tuning errors.
- Fine-tuning customizes model behavior, improving accuracy and style on your coding domain.