ValueError
openai.error.InvalidRequestError
Stack trace
openai.error.InvalidRequestError: Invalid training file format: each line must be a JSON object with 'prompt' and 'completion' keys.
File "fine_tune.py", line 42, in main
response = client.fine_tunes.create(training_file=training_file_id)
File "/usr/local/lib/python3.9/site-packages/openai/api_resources/fine_tune.py", line 50, in create
raise InvalidRequestError("Invalid training file format")
Why it happens
OpenAI fine-tuning requires the training dataset to be a JSONL file in which each line is a single JSON object containing 'prompt' and 'completion' fields. If a line is missing either field, carries extra fields, or is not valid JSON, the API rejects the file with an InvalidRequestError describing the format mismatch.
Detection
Validate your training dataset file before upload by checking that each line is valid JSON with exactly the 'prompt' and 'completion' keys, and log the file upload response so that any reported errors are visible.
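The check described above can be sketched with the standard library alone. This is a minimal illustration, not an official validator: the function name validate_jsonl_lines and the wording of the problem messages are assumptions made for this example.

```python
import json

# Keys the article says every record must contain, and nothing else.
REQUIRED_KEYS = {"prompt", "completion"}


def validate_jsonl_lines(lines):
    """Return a list of (line_number, problem) tuples; an empty list means every line passed."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            problems.append((number, "blank line"))
            continue
        try:
            record = json.loads(text)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = REQUIRED_KEYS - set(record)
        extra = set(record) - REQUIRED_KEYS
        if missing:
            problems.append((number, f"missing keys: {sorted(missing)}"))
        if extra:
            problems.append((number, f"unexpected keys: {sorted(extra)}"))
    return problems


# Hypothetical sample data: one good line, one missing a key, one that is not JSON.
sample = [
    '{"prompt": "Hi", "completion": " Hello"}',
    '{"prompt": "Hi"}',
    "not json",
]
for number, problem in validate_jsonl_lines(sample):
    print(f"line {number}: {problem}")
```

Because the function accepts any iterable of strings, an open file handle (opened with encoding="utf-8") can be passed directly in place of the sample list.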
Causes & fixes
Training file lines are not valid JSON objects
Ensure each line in the training file is a valid JSON object by validating with a JSON linter or parser before upload.
Missing required 'prompt' or 'completion' keys in dataset entries
Add both 'prompt' and 'completion' keys to every JSON object line in the training dataset exactly as required.
Extra or unexpected fields present in dataset JSON objects
Remove any fields other than 'prompt' and 'completion' from each JSON object line to comply with OpenAI fine-tuning schema.
File encoding or line ending issues causing parsing failures
Save the training file as UTF-8 encoded text with Unix-style LF line endings to avoid hidden parsing errors.
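Two of the fixes above, removing extra fields and normalizing encoding and line endings, lend themselves to automation. The following is a rough sketch under stated assumptions: the helper names (normalize_text, strip_extra_keys, repair_jsonl) are made up for this example, and the input is assumed to be UTF-8 bytes whose non-blank lines are already JSON objects containing both required keys.

```python
import json

# Only these keys are kept; their order here is the order written out.
ALLOWED_KEYS = ("prompt", "completion")


def normalize_text(raw_bytes):
    """Decode UTF-8 bytes (dropping a BOM if present) and convert CRLF/CR line endings to LF."""
    text = raw_bytes.decode("utf-8-sig")
    return text.replace("\r\n", "\n").replace("\r", "\n")


def strip_extra_keys(record):
    """Return a copy of record with only the allowed keys; raises KeyError if one is missing."""
    return {key: record[key] for key in ALLOWED_KEYS}


def repair_jsonl(raw_bytes):
    """Return cleaned JSONL text: LF endings, no BOM, no blank lines, no extra fields."""
    cleaned = []
    for line in normalize_text(raw_bytes).split("\n"):
        if line.strip():
            record = strip_extra_keys(json.loads(line))
            cleaned.append(json.dumps(record, ensure_ascii=False))
    return "".join(line + "\n" for line in cleaned)


# Hypothetical input with a BOM, Windows line endings, and an extra "id" field.
raw = b'\xef\xbb\xbf{"prompt": "Q", "completion": " A", "id": 7}\r\n{"prompt": "X", "completion": " Y"}\r\n'
print(repair_jsonl(raw), end="")
```

The sketch deliberately lets json.JSONDecodeError and KeyError propagate: malformed JSON and missing keys need a human decision, whereas extra fields and line endings can be corrected mechanically.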
Code: broken vs fixed
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
training_file_id = "file-abc123"
# The uploaded file was never validated, so the call fails if its format is invalid
response = client.fine_tunes.create(training_file=training_file_id)  # raises InvalidRequestError
print(response)

import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Validate the training file format before upload:
# each line must be a JSON object with 'prompt' and 'completion' keys only
training_file_id = "file-abc123"
response = client.fine_tunes.create(training_file=training_file_id)  # succeeds once the file is valid JSONL
print(response)
Workaround
Manually parse and validate each line of your training dataset file with a JSON parser before upload, fixing or removing invalid lines to avoid API rejection.
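When fixing every bad line by hand is impractical, the "removing invalid lines" half of this workaround can be approximated as below. This is only a sketch: split_valid_lines is a hypothetical helper, and it treats any line that is not a JSON object with exactly the two required keys as rejected.

```python
import json


def split_valid_lines(lines):
    """Partition lines into (valid, rejected); rejected holds (line_number, raw_text) pairs."""
    valid, rejected = [], []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        try:
            record = json.loads(text)
        except json.JSONDecodeError:
            record = None  # unparseable (or blank) lines are rejected below
        if isinstance(record, dict) and set(record) == {"prompt", "completion"}:
            valid.append(text)
        else:
            rejected.append((number, text))
    return valid, rejected


# Hypothetical dataset with one usable line and two lines to set aside.
dataset = [
    '{"prompt": "Translate: cat", "completion": " chat"}',
    '{"prompt": "Translate: dog"}',
    "{broken",
]
valid, rejected = split_valid_lines(dataset)
print(f"kept {len(valid)} line(s), rejected {len(rejected)}")
for number, text in rejected:
    print(f"  line {number}: {text}")
```

Keeping the rejected lines together with their line numbers, rather than silently dropping them, makes it possible to repair and re-add them later.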
Prevention
Automate dataset validation with a script that enforces the JSONL format and the required keys before any training file is uploaded, and check the error messages returned by OpenAI's file upload API to catch remaining issues early.