ValidJsonValidationError
guardrails.validators.valid_json.ValidJsonValidationError
Stack trace
guardrails.validators.valid_json.ValidJsonValidationError: Value '[LLM_OUTPUT]' is not valid JSON. Error: Expecting value: line 1 column 1 (char 0) Failed validation on step with id: validate_json Guard call failed with error: Validation failed for field with code: valid_json
Why it happens
The ValidJson validator enforces that LLM output is parseable as valid JSON. When the LLM returns markdown code fences (```json ... ```), preamble text before JSON, or malformed JSON with extra commas/quotes, the validator rejects it. This is especially common with models not instruction-tuned for strict JSON output, or when the prompt doesn't explicitly forbid markdown formatting.
Detection
Add logging before guard execution to inspect the raw LLM output and catch formatting issues early. Use guard.validate() in a try/except block to catch ValidJsonValidationError and log the raw response for analysis.
Causes & fixes
LLM wrapped JSON in markdown code fences (```json ... ```)
Add explicit instruction to prompt: 'Return ONLY raw JSON, no markdown fences, no explanation' or use Guardrails' custom prompt injection that auto-strips code fences
LLM output has preamble text before the JSON (e.g., 'Here is the JSON:' followed by actual JSON)
Use Guard's string_before_json_extractor or add post-processing: `json.loads(re.search(r'\{.*\}', response, re.DOTALL).group())`
JSON is malformed (missing quotes, trailing commas, unescaped newlines in strings)
Upgrade to guardrails-ai>=0.5.0 and use ValidJson with auto_repair=True, or switch to a more capable model like gpt-4o-mini or claude-3-5-haiku-20241022
Using older Guardrails Guard syntax that doesn't chain validators properly
Update to guardrails-ai>=0.5.0 and use Guard().use(ValidJson, on_fail='exception') pattern instead of legacy Guard constructor
Code: broken vs fixed
import os
import json
from guardrails import Guard
from guardrails.hub import ValidJson
from openai import OpenAI
# BROKEN: No retry logic, no on_fail handler for JSON validation
client = OpenAI(api_key=os.environ['OPENAI_API_KEY'])
guard = Guard().use(ValidJson)
response = client.chat.completions.create(
model='gpt-3.5-turbo', # Weak model for JSON tasks
messages=[{'role': 'user', 'content': 'Return user info as JSON'}]
)
# This line FAILS with ValidJsonValidationError if model returns markdown or preamble
validated = guard.validate(response.choices[0].message.content)
print(validated) import os
import json
from guardrails import Guard
from guardrails.hub import ValidJson
from openai import OpenAI
# FIXED: Use instruction-tuned model, explicit prompt, and auto-repair
client = OpenAI(api_key=os.environ['OPENAI_API_KEY'])
guard = Guard().use(ValidJson, on_fail='fix_str') # Auto-extract JSON from malformed output
response = client.chat.completions.create(
model='gpt-4o-mini', # CHANGED: Better at following JSON instructions
messages=[{
'role': 'user',
'content': 'Return user info as valid JSON only. No markdown, no explanation. {"name": "...", "email": "..."}'
}]
)
# Now handles markdown fences, preamble text, and minor JSON errors gracefully
try:
validated = guard.validate(response.choices[0].message.content)
print(f'Valid JSON: {validated.validated_output}')
except Exception as e:
print(f'Validation failed even with auto-repair: {e}') Workaround
Extract JSON using regex before validation: `import re; json_str = re.search(r'\{.*\}', response_text, re.DOTALL).group(); validated = guard.validate(json_str)` to strip markdown and preamble manually, bypassing the formatter sensitivity.
Prevention
Use OpenAI's response_format={'type': 'json_object'} at the API level to guarantee valid JSON output before it reaches Guardrails, or adopt Anthropic's tool use protocol which returns structured data natively instead of relying on LLM text formatting.