API Intermediate medium · 6 min

Supported output formats

What you will learn
Control whether the API returns plain text, valid JSON, or schema-conforming JSON using the response_format parameter in chat completions.

Why this matters

Different downstream systems need different formats: some require structured JSON for parsing, others need raw text. Using the wrong format wastes tokens, breaks pipelines, or requires expensive post-processing to reformat the response.

Skip if: You're building a chatbot with unstructured conversational output, where plain text is sufficient and cheaper. Don't force JSON format just because it's available: use it only when your system requires structured parsing or the output must be syntactically valid JSON.

Explanation

What it does: The response_format parameter tells the OpenAI API how to structure the response. You can request plain text (default), JSON mode (which guarantees valid JSON output), or use structured outputs (schema-based responses for guaranteed field compliance).

How it works: When you set response_format={"type": "json_object"}, the API steers generation so that the output parses as valid JSON. JSON mode also requires the word "JSON" to appear somewhere in your messages; without it the request is rejected. When you use structured outputs with a schema, the returned JSON additionally conforms to your field definitions, types, and requirements. Structured outputs achieve this through constrained decoding: at each sampling step, tokens that would break JSON validity or violate the schema are excluded. The tokenizer itself is not modified.
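The three request shapes described above can be sketched as a small helper. This is only an illustration: `build_response_format` and its mode labels ("text", "json", "schema") are hypothetical names invented for this sketch, not part of the OpenAI SDK. The dictionary shapes are the ones used in this lesson's request code.

```python
from typing import Optional


def build_response_format(mode: str, schema: Optional[dict] = None,
                          name: str = "result") -> Optional[dict]:
    """Return a response_format value for a chat completion request.

    Hypothetical helper: the mode labels are this sketch's own, not API values.
    """
    if mode == "text":
        # Plain text is the default, so the parameter can simply be omitted.
        return None
    if mode == "json":
        return {"type": "json_object"}
    if mode == "schema":
        if schema is None:
            raise ValueError("schema mode needs a JSON Schema dict")
        return {
            "type": "json_schema",
            "json_schema": {"name": name, "schema": schema, "strict": True},
        }
    raise ValueError(f"unknown mode: {mode!r}")
```

A caller would add the result to the request keyword arguments only when it is not None, which keeps the plain-text path identical to a request without response_format.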

When to use it: Use JSON mode when you need guaranteed parseable output but don't need strict schema enforcement. Use structured outputs when you're extracting entities, building APIs, or feeding responses into downstream systems that demand exact field names and types. Plain text remains the default for cost and latency: use it for chat, creative writing, or analysis where you'll parse the text manually.

Request code

python
import json
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ.get('OPENAI_API_KEY'))

# Example 1: JSON mode with plain JSON object
# JSON mode requires the word "JSON" to appear somewhere in the messages.
response_json = client.chat.completions.create(
    model='gpt-4o-2024-11-20',
    messages=[
        {'role': 'user', 'content': 'Extract the name, age, and city as JSON from: John is 28 years old and lives in Portland.'}
    ],
    response_format={'type': 'json_object'}
)

print('JSON Mode Response:')
result = json.loads(response_json.choices[0].message.content)
print(json.dumps(result, indent=2))

# Example 2: Structured output with explicit schema
schema = {
    'type': 'object',
    'properties': {
        'name': {'type': 'string'},
        'age': {'type': 'integer'},
        'city': {'type': 'string'}
    },
    'required': ['name', 'age', 'city'],
    'additionalProperties': False
}

response_structured = client.chat.completions.create(
    model='gpt-4o-2024-11-20',
    messages=[
        {'role': 'user', 'content': 'Extract: Alice is 32 and works in Seattle.'}
    ],
    response_format={'type': 'json_schema', 'json_schema': {'name': 'person', 'schema': schema, 'strict': True}}
)

print('\nStructured Output Response:')
result_structured = json.loads(response_structured.choices[0].message.content)
print(json.dumps(result_structured, indent=2))

# Example 3: Plain text (default)
response_text = client.chat.completions.create(
    model='gpt-4o-2024-11-20',
    messages=[
        {'role': 'user', 'content': 'Explain why the sky is blue in one sentence.'}
    ]
)

print('\nPlain Text Response:')
print(response_text.choices[0].message.content)

Authentication

Ensure the OPENAI_API_KEY environment variable is set before instantiating the client. The SDK reads it at initialization time. No additional authentication is required for response_format: it's a standard parameter on chat completion requests.
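A fail-fast check can make a missing key obvious before any request is attempted. The sketch below is one possible approach; `require_api_key` is a hypothetical helper name, and accepting the environment as a parameter is a choice made here so the function is easy to test.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment or raise a clear error."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before creating the client"
        )
    return key
```

The returned value could then be passed as the api_key argument when constructing the client.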

Response shape

Field      Description
choices    array of choice objects (each with message and finish_reason)
usage      object (prompt_tokens, completion_tokens, total_tokens)
model      string (model ID used)

Field guide

choices[0].message.content

The actual response text or JSON string. Always a string: parse with json.loads() if response_format is json_object or json_schema.

finish_reason

Indicates why generation stopped. 'stop' is normal. 'length' means max_tokens was hit. 'content_filter' means safety policy blocked output.
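Checking finish_reason before parsing avoids feeding truncated output to json.loads(). The following is a minimal sketch, assuming a choice object with finish_reason and message.content attributes as described in this field guide. `parse_choice` is a hypothetical helper, and the SimpleNamespace object is only a stand-in for a real SDK response.

```python
import json
from types import SimpleNamespace


def parse_choice(choice):
    """Parse a choice's JSON content, refusing truncated or filtered output."""
    if choice.finish_reason == "length":
        raise ValueError("output truncated at max_tokens; JSON is likely incomplete")
    if choice.finish_reason == "content_filter":
        raise ValueError("output blocked by the content filter")
    return json.loads(choice.message.content)


# Stand-in with the same attribute layout as a response choice (illustration only).
fake_choice = SimpleNamespace(
    finish_reason="stop",
    message=SimpleNamespace(content='{"name": "John", "age": 28, "city": "Portland"}'),
)
person = parse_choice(fake_choice)
print(person["city"])  # Portland
```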

usage.completion_tokens

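The number of tokens the model generated, which is what you are billed for on the output side. Developers often miss that JSON mode may use slightly more tokens than plain text for the same semantic content because of JSON syntax overhead (braces, quotes, and key names).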

Setup trap

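When using structured outputs with strict: True, the schema must be valid JSON Schema and must stay within the subset of keywords that structured outputs support. Strict mode also expects every object to set additionalProperties to false and to list all of its properties under required. If your schema has type errors, unsupported keywords, or properties missing from required, the API rejects the request with a 400 error before running inference. Validate your schema object locally with a JSON Schema validator before sending it.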
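Part of this pre-flight check can run locally with the standard library alone. The sketch below assumes two strict-mode rules: every object sets additionalProperties to false and lists all of its properties under required. It does not validate the schema against the JSON Schema specification, it ignores union types such as ["object", "null"], and `strict_schema_problems` is a hypothetical helper name.

```python
def strict_schema_problems(schema: dict, path: str = "$") -> list:
    """Collect violations of two assumed strict-mode rules, recursively."""
    problems = []
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: properties not listed in required: {missing}")
        # Nested objects must follow the same rules.
        for key, sub in props.items():
            problems.extend(strict_schema_problems(sub, f"{path}.{key}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems.extend(strict_schema_problems(schema["items"], f"{path}[]"))
    return problems
```

An empty list means the two checked rules hold. It does not mean the API will accept the schema, so a full JSON Schema validator is still worth running.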

Cost

JSON mode and structured outputs incur standard completion token pricing but may use 5-15% more tokens than plain text for the same information due to JSON syntax constraints. If you're extracting simple key-value pairs, plain text with parsing may be cheaper. Measure both approaches on a representative sample.
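Measuring both approaches can be as simple as comparing average completion token counts across a sample. The sketch below is illustrative: `token_overhead_pct` is a hypothetical helper, and the numbers in the usage line are made up. In practice the counts would come from usage.completion_tokens on each response.

```python
def token_overhead_pct(plain_tokens: list, json_tokens: list) -> float:
    """Percent extra completion tokens in the JSON runs versus the plain-text runs."""
    if not plain_tokens or not json_tokens:
        raise ValueError("need at least one measurement per format")
    plain_avg = sum(plain_tokens) / len(plain_tokens)
    json_avg = sum(json_tokens) / len(json_tokens)
    return (json_avg - plain_avg) / plain_avg * 100


# Made-up counts for illustration only.
print(round(token_overhead_pct([40, 50, 60], [44, 55, 66]), 1))  # 10.0
```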

Rate limits

No special rate limit behavior for response_format. All response formats count against your organization's tokens-per-minute limit equally.

Common gotcha

Forgetting to parse the JSON string. Even with response_format set to json_object, the API returns choices[0].message.content as a string, not a parsed object. You must call json.loads() to access fields. Attempting to access response.choices[0].message.content['name'] directly will fail with a TypeError.
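The failure is easy to reproduce without calling the API. In this small demonstration, the `content` string is a stand-in for choices[0].message.content.

```python
import json

# Stand-in for choices[0].message.content: a JSON document held in a str.
content = '{"name": "John", "age": 28}'

try:
    content["name"]  # indexing a str with a str key
except TypeError as exc:
    error_message = str(exc)  # Python reports that str indices must be integers

data = json.loads(content)  # parse first, then index the resulting dict
name = data["name"]
print(name)  # John
```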

Error recovery

json.JSONDecodeError
The model output is malformed JSON even with response_format json_object set. This is rare but can happen if generation hit the token limit mid-response, in which case finish_reason is 'length'. Retry with a higher max_tokens or shorter input. Alternatively, wrap json.loads() in a try-except and fall back to regex extraction.
400 Bad Request with 'schema error'
Your structured output schema is invalid. Common causes: mismatched property types, circular references, or using unsupported JSON Schema keywords. Validate the schema locally using jsonschema library before sending: `from jsonschema import Draft202012Validator; Draft202012Validator.check_schema(your_schema)`.
APIError with 'response format not supported'
The model version you're using doesn't support the response_format you requested. Structured outputs (json_schema) require gpt-4o-2024-08-06, gpt-4o-mini-2024-07-18, or newer snapshots. Older models like gpt-3.5-turbo only support json_object.
response.choices[0].message.content is empty string
The content filter blocked the response. Check your prompt for requests that might violate policy. If legitimate, try rephrasing or use a different model.
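The retry advice for json.JSONDecodeError can be sketched as a small wrapper. This is one possible approach under stated assumptions: `fetch` is a hypothetical callable that takes a max_tokens value and returns the response content string, and doubling the limit on each attempt is an arbitrary policy chosen for the example.

```python
import json
from typing import Any, Callable


def fetch_json_with_retry(fetch: Callable[[int], str], max_tokens: int = 256,
                          attempts: int = 3) -> Any:
    """Call fetch(max_tokens) and parse the result, doubling the limit on bad JSON."""
    last_error = None
    for _ in range(attempts):
        try:
            return json.loads(fetch(max_tokens))
        except json.JSONDecodeError as exc:
            last_error = exc
            max_tokens *= 2  # assume truncation and give the model more room
    raise RuntimeError(f"no valid JSON after {attempts} attempts") from last_error
```

In real use, `fetch` would wrap the chat completion call and return choices[0].message.content. Non-JSON errors such as a 400 response are deliberately left to propagate.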

Experienced dev note

Structured outputs with schemas eliminate an entire class of downstream parsing bugs and model inconsistency issues. Instead of writing regex or custom parsing to extract fields from text, define your schema once and let the API guarantee the structure. The token overhead is usually worth the saved engineering time and the reduction in production incidents from malformed extraction. Do not assume json_schema with strict: True is cheaper than json_object: billing depends on the tokens actually generated, so compare usage.completion_tokens for both on your own data. Use json_schema whenever you have a known structure, because it gives the stronger guarantee.

Check your understanding

You're building an API that extracts invoice details (invoice_id, amount, date, vendor_name) from unstructured text and feeds them into a SQL database. Your team debates using json_object vs. structured outputs. What are the tradeoffs, and which should you choose for production? Why?

Answer hint

json_object guarantees valid JSON but not schema compliance: the model might return extra fields, wrong types, or missing fields. Structured outputs with strict: True guarantees the exact fields and types you specify, preventing schema validation errors downstream. For production data pipelines, structured outputs save error handling code and eliminate silent data quality issues from type mismatches.
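For the invoice scenario, a strict schema might look like the sketch below. The four field names come from the question. The types, the schema name "invoice", and the idea that date is carried as text are assumptions made for illustration.

```python
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_id": {"type": "string"},
        "amount": {"type": "number"},
        "date": {"type": "string"},  # assumed to be date text that your code parses later
        "vendor_name": {"type": "string"},
    },
    "required": ["invoice_id", "amount", "date", "vendor_name"],
    "additionalProperties": False,
}

# Value to pass as response_format in the chat completion request.
invoice_response_format = {
    "type": "json_schema",
    "json_schema": {"name": "invoice", "schema": invoice_schema, "strict": True},
}
```

Because the schema fixes field names and types, the SQL insert step can map keys to columns directly, though value-level checks such as date validity still belong in your own code.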

VERSION: Structured outputs (json_schema) are available in gpt-4o-2024-08-06, gpt-4o-mini-2024-07-18, and later snapshots. gpt-3.5-turbo and gpt-4-turbo do not support them and offer only json_object mode. Always specify a model version explicitly rather than relying on version aliases, because the set of models that supports each format changes over time.
