OutputParserException
langchain.schema.output_parser.OutputParserException
Stack trace
langchain_core.exceptions.OutputParserException: Could not parse LLM output: `{
"name": "John Doe",
"age": "thirty",
"email": "john.doe@example.com"
}`
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "app.py", line 42, in <module>
result = instructor.from_openai.extract(text)
File "langchain_core/instructor.py", line 88, in from_openai
return self.parser.parse(llm_response)
langchain_core.exceptions.OutputParserException: Failed to parse structured extraction output. Why it happens
The Instructor fromOpenAI parser expects the LLM to return output strictly matching the defined structured schema, often a JSON object with exact field names and types. If the LLM returns extra text, markdown fences, or fields with incorrect types (e.g., string instead of int), the parser raises OutputParserException. This often happens if the prompt lacks clear instructions or the model is not instruction-tuned.
Detection
Wrap calls to the Instructor fromOpenAI parser in try/except OutputParserException and log the raw LLM output to detect format mismatches before the app crashes.
Causes & fixes
LLM output includes markdown fences or explanatory text around the JSON response
Modify the prompt to instruct the model to return ONLY raw JSON without markdown fences or extra text, or use a parser that strips fences automatically.
Field names or data types in the LLM output do not match the Pydantic schema exactly
Ensure the Pydantic model field names and types exactly match the expected output fields and types defined in the prompt.
Using a base LLM model that ignores or poorly follows output format instructions
Switch to an instruction-tuned model such as gpt-4o-mini or claude-3-5-haiku-20241022 that reliably follows structured output instructions.
Code: broken vs fixed
from langchain_core.instructor import Instructor
instructor = Instructor.from_openai(model="gpt-4o")
text = "Extract user info"
# This line raises OutputParserException due to markdown fences in output
result = instructor.extract(text)
print(result) import os
from langchain_core.instructor import Instructor
from langchain_core.output_parsers import JsonOutputParser
instructor = Instructor.from_openai(model="gpt-4o-mini", output_parser=JsonOutputParser()) # Added JsonOutputParser to handle fences
text = "Extract user info"
result = instructor.extract(text) # Now parses correctly without error
print(result) Workaround
Catch OutputParserException, then extract the JSON substring from the raw LLM output using regex and parse it manually with json.loads() as a fallback.
Prevention
Use OpenAI's structured response_format or Anthropic's tool use features to enforce schema-valid outputs at the API level, avoiding parser fragility and format mismatches.