Fix Pydantic validation error from LLM
Quick answer

A Pydantic validation error from an LLM usually occurs when the model's JSON output does not match the expected schema. To fix it, parse the LLM's response.choices[0].message.content as JSON first, then validate the resulting dict against your Pydantic model explicitly.

Quick fix: parse the LLM response content as JSON before passing it to your Pydantic model for validation.
Why this happens
When using structured outputs with an LLM, the model returns a JSON string inside response.choices[0].message.content. If you try to directly validate this string with a Pydantic model without parsing it first, you get a validation error because Pydantic expects a dict, not a raw JSON string.
Example error output (Pydantic v1, when a raw string is passed instead of a dict):

```
pydantic.error_wrappers.ValidationError: 1 validation error for MyModel
__root__
  MyModel expected dict not str (type=type_error)
```
```python
import json

from pydantic import BaseModel

class MyModel(BaseModel):
    name: str
    age: int

# Simulated LLM response content (a JSON string)
llm_response_content = '{"name": "Alice", "age": 30}'

# Incorrect: passing the string directly to Pydantic
# model = MyModel.parse_obj(llm_response_content)  # raises ValidationError

# Correct: parse the JSON string first
parsed = json.loads(llm_response_content)
model = MyModel.parse_obj(parsed)
print(model)
```

Output:

```
name='Alice' age=30
```
The fix
Always parse the LLM's JSON string output before validating with Pydantic. Use json.loads() on response.choices[0].message.content to convert it into a Python dict, then pass that dict to your Pydantic model.
This works because Pydantic expects a dict or keyword arguments, not a raw JSON string.
```python
import json
import os

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

class User(BaseModel):
    name: str
    age: int

response = client.chat.completions.create(
    model="gpt-4o",
    # JSON mode nudges the model to return syntactically valid JSON
    response_format={"type": "json_object"},
    messages=[{"role": "user", "content": "Generate a JSON object with name and age."}],
)

content = response.choices[0].message.content

# Parse the JSON string from the LLM output
parsed = json.loads(content)

# Validate with Pydantic
user = User.parse_obj(parsed)
print(user)
```

Output (the actual values depend on what the model generates), e.g.:

```
name='Alice' age=30
```
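If you are on Pydantic v2, you can skip the manual json.loads() step entirely: model_validate_json() parses and validates a raw JSON string in one call. A minimal sketch:

```python
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

# Raw JSON string, as found in response.choices[0].message.content
content = '{"name": "Alice", "age": 30}'

# Pydantic v2: parse and validate the string in one step
user = User.model_validate_json(content)
print(user)  # name='Alice' age=30
```

model_validate_json raises pydantic.ValidationError for both malformed JSON and schema mismatches, so a single except clause covers both failure modes.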
Preventing it in production
To avoid runtime validation errors, always:

- Parse the LLM output JSON string before validation.
- Use try-except blocks around json.loads() to catch malformed JSON.
- Implement retries or fallback logic if the LLM returns invalid JSON.
- Consider using response_format or structured-output features if your API supports them to enforce JSON output.
```python
import json

from pydantic import ValidationError

try:
    parsed = json.loads(content)
    user = User.parse_obj(parsed)
except json.JSONDecodeError:
    print("Invalid JSON from LLM")
except ValidationError as e:
    print("Validation failed:", e)
```

Output:

```
Invalid JSON from LLM  # or a detailed validation error message
```
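The retry suggestion above can be sketched as a small wrapper. Here, parse_with_retry and get_content are hypothetical names (not part of any library); get_content stands in for whatever function fetches a fresh completion from your LLM client:

```python
import json

from pydantic import BaseModel, ValidationError

class User(BaseModel):
    name: str
    age: int

def parse_with_retry(get_content, model_cls, max_attempts=3):
    """Re-invoke the LLM until its output parses and validates."""
    for _ in range(max_attempts):
        content = get_content()  # e.g. a fresh chat-completion call
        try:
            return model_cls.parse_obj(json.loads(content))
        except (json.JSONDecodeError, ValidationError):
            continue  # malformed or off-schema output: try again
    raise RuntimeError("LLM never returned valid JSON")

# Simulated LLM that returns garbage once, then valid JSON
responses = iter(["not json at all", '{"name": "Alice", "age": 30}'])
user = parse_with_retry(lambda: next(responses), User)
print(user)  # name='Alice' age=30
```

Bounding the loop with max_attempts keeps a persistently misbehaving model from retrying forever; the final RuntimeError is your cue to fall back or surface the failure.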
Key Takeaways
- Always parse LLM JSON string output with json.loads() before Pydantic validation.
- Use try-except to handle malformed JSON or validation errors gracefully.
- Design LLM prompts to produce consistent, complete JSON for reliable structured outputs.