How to fine-tune an LLM for coding
Quick answer
To fine-tune an LLM for coding, prepare a dataset of code examples, then use a fine-tuning API or framework such as OpenAI's fine-tuning endpoint with a base model like gpt-4o. Fine-tuning adjusts the model's weights to improve coding accuracy and style on your specific tasks.
Prerequisites
- Python 3.8+
- OpenAI API key with billing enabled (fine-tuning is a paid feature)
- pip install openai>=1.0
Setup
Install the OpenAI Python SDK and set your API key as an environment variable to access the fine-tuning API.
pip install openai>=1.0
# In your shell:
# export OPENAI_API_KEY="your-api-key"
Step by step
Prepare a JSONL file of chat-formatted code examples (each line is a `messages` array pairing a user prompt with an assistant completion; the older prompt-completion format only applies to legacy completions models), then call the fine-tuning API to create a fine-tuned model. Use the fine-tuned model for coding completions.
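The training file can be generated programmatically rather than hand-edited, which avoids JSON escaping mistakes. A minimal sketch, assuming the filename used below; the two pairs are illustrative only, and a real job needs substantially more examples:

```python
import json

# Illustrative training pairs: user prompt -> assistant completion
examples = [
    ("def add(a, b):", "    return a + b"),
    ("def square(n):", "    return n * n"),
]

# Each JSONL line is a chat transcript the model should learn to complete
with open("code_finetune_data.jsonl", "w") as f:
    for prompt, completion in examples:
        record = {
            "messages": [
                {"role": "user", "content": prompt},
                {"role": "assistant", "content": completion},
            ]
        }
        f.write(json.dumps(record) + "\n")
```

Using json.dumps guarantees that newlines and quotes inside code completions are escaped correctly on every line.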
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Step 1: Prepare training data in chat-format JSONL
# Example content of 'code_finetune_data.jsonl' (two lines shown for brevity;
# the API requires at least 10 examples):
# {"messages": [{"role": "user", "content": "def add(a, b):"}, {"role": "assistant", "content": "    return a + b"}]}
# {"messages": [{"role": "user", "content": "def factorial(n):"}, {"role": "assistant", "content": "    if n == 0:\n        return 1\n    return n * factorial(n-1)"}]}
# Step 2: Upload training file
upload_response = client.files.create(
    file=open("code_finetune_data.jsonl", "rb"),
    purpose="fine-tune"
)
file_id = upload_response.id
# Step 3: Create fine-tune job
fine_tune_response = client.fine_tuning.jobs.create(
    training_file=file_id,
    model="gpt-4o-2024-08-06"  # fine-tuning targets a dated snapshot, not the "gpt-4o" alias
)
print("Fine-tune job created:", fine_tune_response.id)
# Step 4: After fine-tuning completes, use the fine-tuned model
# Replace 'fine_tuned_model_name' with the fine_tuned_model value from the
# completed job (it starts with 'ft:')
response = client.chat.completions.create(
    model="fine_tuned_model_name",
    messages=[{"role": "user", "content": "def fibonacci(n):"}]
)
print(response.choices[0].message.content)
Output
Fine-tune job created: ftjob-A1B2C3D4E5
# Output from the fine-tuned model might be:
#     if n <= 1:
#         return n
#     else:
#         return fibonacci(n-1) + fibonacci(n-2)
Common variations
You can poll the fine-tune job status asynchronously, use a smaller base model such as gpt-4o-mini for faster and cheaper training, or use streaming completions for real-time code generation.
import time
# Poll the fine-tune job until it reaches a terminal state
job_id = fine_tune_response.id
while True:
    job = client.fine_tuning.jobs.retrieve(job_id)
    print(f"Fine-tune status: {job.status}")
    if job.status in ["succeeded", "failed", "cancelled"]:
        break
    time.sleep(30)
# The completed job carries the fine-tuned model name
print("Fine-tuned model:", job.fine_tuned_model)
# Use streaming for code generation
response = client.chat.completions.create(
    model="fine_tuned_model_name",
    messages=[{"role": "user", "content": "def reverse_string(s):"}],
    stream=True
)
for chunk in response:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")
Output
Fine-tune status: running
Fine-tune status: succeeded
def reverse_string(s):
    return s[::-1]
Troubleshooting
- If you see errors uploading files, verify the JSONL format and file size limits.
- If fine-tuning fails, check your training data quality and ensure prompts and completions are well paired.
- Timeouts during polling can be handled by increasing sleep intervals.
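To catch format problems locally before uploading, each line of the training file can be checked for valid JSON and a well-formed messages array. A sketch, assuming the chat-format JSONL described above; the validator function and its checks are illustrative, not an official tool:

```python
import json

def validate_jsonl(path):
    """Return 'ok' or a description of the first malformed line."""
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            # Every line must parse as standalone JSON
            try:
                record = json.loads(line)
            except json.JSONDecodeError as e:
                return f"line {lineno}: invalid JSON ({e})"
            # Chat-format training data needs a non-empty messages array
            messages = record.get("messages")
            if not isinstance(messages, list) or not messages:
                return f"line {lineno}: missing 'messages' array"
            for msg in messages:
                if "role" not in msg or "content" not in msg:
                    return f"line {lineno}: message missing 'role' or 'content'"
    return "ok"
```

Running this before client.files.create turns an opaque upload or training failure into a precise line number to fix.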
Key Takeaways
- Fine-tune LLMs by preparing prompt-completion JSONL datasets tailored to coding tasks.
- Use OpenAI's fine-tuning API with models like gpt-4o for best coding results.
- Poll fine-tune job status asynchronously and use streaming completions for interactive coding.
- Validate training data format and quality to avoid fine-tuning errors.
- Fine-tuning customizes model behavior, improving accuracy and style on your coding domain.