Debug Fix intermediate · 3 min read

How to handle an LLM refusing structured output

Quick answer
When an LLM refuses to produce structured output, give it explicit formatting instructions in the prompt and validate the output format programmatically. Use a system message to enforce the structure and parse responses defensively before downstream use.
ERROR TYPE model_behavior
⚡ QUICK FIX
Add explicit, detailed instructions in your prompt to enforce the structured output format and validate the response before processing.

Why this happens

LLMs sometimes refuse or fail to produce structured output because the prompt lacks clear, explicit instructions, or because the model interprets the request ambiguously. For example, asking for JSON or XML without specifying exact formatting rules can cause the model to respond in free text or with only a partial structure. This often happens when the model tries to be helpful but is uncertain about the expected format, leading to inconsistent or malformed responses.

Example of an under-specified request using the openai SDK:

python
from openai import OpenAI
import os

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# No system message and no strict format rules: the model may comply,
# wrap the JSON in prose, or return a structure other than a list.
response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[
        {"role": "user", "content": "Give me a JSON list of three fruits."}
    ]
)
print(response.choices[0].message.content)
output
{
  "fruits": ["apple", "banana", "cherry"]
}

# Note: the model returned a JSON object, not the requested list. On other
# runs the output may be wrapped in prose, missing quotes, or truncated.

The fix

Explicitly instruct the model to respond only with the structured format, including exact syntax and no extra text. Use a system message to set the format rules and a user message to request the data. Then parse and validate the output to catch errors early.

This approach works because it reduces ambiguity and guides the model to comply strictly with the output format.

python
from openai import OpenAI
import os
import json

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

system_prompt = (
    "You are a JSON generator. Respond ONLY with a valid JSON array of strings, no extra text."
)
user_prompt = "List exactly three fruits as a JSON array of strings."

response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ]
)

output = response.choices[0].message.content

try:
    fruits = json.loads(output)
    print("Parsed fruits list:", fruits)
except json.JSONDecodeError:
    print("Failed to parse JSON output:", output)
output
Parsed fruits list: ['apple', 'banana', 'cherry']
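json.loads only confirms that the text is syntactically valid JSON; it says nothing about the shape of the data. A strict shape check catches structurally valid JSON that still violates the contract, such as an object where an array was requested. A minimal sketch (validate_fruits is a hypothetical helper, not part of any SDK):

```python
import json


def validate_fruits(raw: str) -> list:
    """Parse model output and enforce the expected shape:
    a JSON array of exactly three strings."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, list):
        raise ValueError(f"Expected a JSON array, got {type(data).__name__}")
    if len(data) != 3 or not all(isinstance(item, str) for item in data):
        raise ValueError(f"Expected exactly three strings, got: {data!r}")
    return data


fruits = validate_fruits('["apple", "banana", "cherry"]')
print("Parsed fruits list:", fruits)
```

Running the same check against the broken example's object output (`{"fruits": [...]}`) raises a ValueError instead of silently passing bad data downstream.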

Preventing it in production

  • Implement retries with prompt adjustments if the output is malformed or missing structure.
  • Validate the output format strictly (e.g., JSON schema validation) before downstream processing.
  • Use explicit system instructions (and a low temperature) to enforce the output format and curb model creativity.
  • Consider fallback logic to re-prompt or use simpler output formats if the model repeatedly fails.
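The retry-with-adjustment pattern from the first bullet can be sketched as a small wrapper. Here call_model stands in for whatever function sends a prompt to your LLM and returns its raw text, and the corrective suffix appended on a retry is an assumption for illustration, not a documented API:

```python
import json

# Hypothetical corrective instruction appended when a reply fails validation.
CORRECTION = (
    "\nYour previous reply was not a valid JSON array. "
    "Respond with ONLY a valid JSON array of strings, no extra text."
)


def request_json_list(call_model, prompt, max_retries=3):
    """Call the model, validate the JSON shape, and retry with a
    corrective instruction appended if validation fails."""
    current_prompt = prompt
    for _ in range(max_retries):
        raw = call_model(current_prompt)
        try:
            data = json.loads(raw)
            if isinstance(data, list) and all(isinstance(x, str) for x in data):
                return data
        except json.JSONDecodeError:
            pass  # fall through to the corrective retry
        current_prompt = prompt + CORRECTION
    raise RuntimeError(f"No valid JSON array after {max_retries} attempts")
```

Because call_model is injected, the wrapper is easy to unit-test with a stub that returns malformed output on the first call and valid JSON on the second.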

Key Takeaways

  • Always use explicit system instructions to enforce structured output format.
  • Validate and parse the model output before using it to catch format errors early.
  • Implement retries and fallback prompts to handle occasional model refusals or errors.
Verified 2026-04 · gpt-4.1-mini