High severity · Intermediate · Fix: 2-5 min

OutputParserException

langchain_core.exceptions.OutputParserException

What this error means

A LangChain output parser used for structured extraction failed because the LLM output did not match the expected format or schema.

Stack trace

traceback

langchain_core.exceptions.OutputParserException: Could not parse LLM output: `{
  "name": "John Doe",
  "age": "thirty",
  "email": "john.doe@example.com"
}`

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "app.py", line 42, in <module>
    result = instructor.from_openai.extract(text)
  File "langchain_core/instructor.py", line 88, in from_openai
    return self.parser.parse(llm_response)
langchain_core.exceptions.OutputParserException: Failed to parse structured extraction output.

QUICK FIX

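Parse the output with JsonOutputParser() or an equivalent that accepts JSON wrapped in markdown fences. If the JSON itself is malformed, wrap the parser in OutputFixingParser (from langchain.output_parsers), which sends the bad output back to an LLM for repair.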

Why it happens

Structured-output parsers such as PydanticOutputParser expect the LLM to return output that matches the defined schema, usually a JSON object with exact field names and types. If the LLM returns extra text, markdown fences the parser does not strip, malformed JSON, or fields with incorrect types (e.g., the string "thirty" where an int is expected), the parser raises OutputParserException. This often happens when the prompt lacks clear format instructions or the model is not instruction-tuned.

Detection

Wrap calls to the parser or chain in try/except OutputParserException and log the raw LLM output (available as the exception's llm_output attribute when the parser sets it), so format mismatches are detected before the app crashes.
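The detection pattern can be sketched as a small wrapper. This is a minimal illustration, not part of LangChain: `parse_with_logging` and the logger name are hypothetical, and the demo uses `json.loads` with `json.JSONDecodeError` as a stand-in so it runs without an LLM.

```python
import json
import logging
from typing import Any, Callable, Optional, Type

logger = logging.getLogger("extraction")  # hypothetical logger name


def parse_with_logging(
    parse: Callable[[str], Any],
    raw: str,
    error_type: Type[Exception],
) -> Optional[Any]:
    """Run parse(raw); on error_type, log the raw output and return None."""
    try:
        return parse(raw)
    except error_type as exc:
        # Log the raw text so format mismatches show up in the logs
        logger.warning("Parse failed (%s); raw LLM output: %r", exc, raw)
        return None


# Stand-in demo: the second reply has explanatory text before the JSON object
good = parse_with_logging(json.loads, '{"name": "John Doe"}', json.JSONDecodeError)
bad = parse_with_logging(json.loads, 'Sure! {"name": "John Doe"}', json.JSONDecodeError)
```

With LangChain you would pass `parser.parse` as `parse` and `OutputParserException` (from `langchain_core.exceptions`) as `error_type`.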

Causes & fixes

LLM output includes markdown fences or explanatory text around the JSON response

✓ Fix

Modify the prompt to instruct the model to return ONLY raw JSON without markdown fences or extra text, or use a parser that strips fences automatically.

Field names or data types in the LLM output do not match the Pydantic schema exactly

✓ Fix

Ensure the Pydantic model's field names and types exactly match the fields and types the prompt asks for. For example, if the model declares age: int, the prompt should request a number, not a word such as "thirty".
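To see which fields are out of line, the parsed payload can be compared with the expected schema. The sketch below is a hand-rolled, standard-library check rather than Pydantic itself; `EXPECTED_TYPES` and `find_mismatches` are hypothetical names, and the check is stricter than Pydantic's default mode, which would coerce a numeric string such as "30" to an int.

```python
import json
from typing import Any, Dict

# Expected schema, mirroring a Pydantic model with name: str, age: int, email: str
EXPECTED_TYPES: Dict[str, type] = {"name": str, "age": int, "email": str}


def find_mismatches(payload: Dict[str, Any]) -> Dict[str, str]:
    """Describe every missing, mistyped, or unexpected field in payload."""
    problems: Dict[str, str] = {}
    for field, expected in EXPECTED_TYPES.items():
        if field not in payload:
            problems[field] = "missing"
        elif type(payload[field]) is not expected:
            got = type(payload[field]).__name__
            problems[field] = f"expected {expected.__name__}, got {got}"
    # Fields the schema does not know about
    for field in payload.keys() - EXPECTED_TYPES.keys():
        problems[field] = "unexpected field"
    return problems


# The payload from the stack trace above
raw = '{"name": "John Doe", "age": "thirty", "email": "john.doe@example.com"}'
problems = find_mismatches(json.loads(raw))
print(problems)  # {'age': 'expected int, got str'}
```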

Using a base (non-instruction-tuned) model that ignores or poorly follows output format instructions

✓ Fix

Switch to an instruction-tuned model such as gpt-4o-mini or claude-3-5-haiku-20241022, which follow structured output instructions far more consistently.

Code: broken vs fixed

Broken - triggers the error

python

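from langchain_core.output_parsers import PydanticOutputParser
from langchain_openai import ChatOpenAI  # assumes langchain-openai is installed
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int
    email: str

# The prompt never tells the model which format to return
chain = ChatOpenAI(model="gpt-4o") | PydanticOutputParser(pydantic_object=User)
text = "Extract user info: John Doe, thirty, john.doe@example.com"

# Raises OutputParserException when the reply is not JSON that validates as User
result = chain.invoke(text)
print(result)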

Fixed - works correctly

python

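from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI  # assumes langchain-openai is installed
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int
    email: str

parser = JsonOutputParser(pydantic_object=User)  # accepts fenced or bare JSON
prompt = PromptTemplate(
    template="Extract the user info.\n{format_instructions}\n{text}",
    input_variables=["text"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | parser
print(chain.invoke({"text": "John Doe, 30, john.doe@example.com"}))  # prints a dict

JsonOutputParser accepts JSON wrapped in markdown fences, and the format instructions in the prompt tell the model exactly which fields to return. It returns a plain dict and does not validate field types, so validate the result against the Pydantic model if you need typed fields.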

⚠ Workaround

Catch OutputParserException, then extract the JSON substring from the raw LLM output using regex and parse it manually with json.loads() as a fallback.
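A minimal sketch of that fallback follows. It swaps a single greedy regex for a slightly more robust variant: a regex locates each candidate "{" and `json.JSONDecoder.raw_decode` decodes from there, which tolerates trailing text and fences. `extract_first_json_object` is a hypothetical helper name, and the sample reply is made up for illustration.

```python
import json
import re
from typing import Any, Optional


def extract_first_json_object(raw: str) -> Optional[Any]:
    """Return the first decodable JSON object found in raw, or None."""
    decoder = json.JSONDecoder()
    # Try each "{" as a possible start of the object
    for match in re.finditer(r"\{", raw):
        try:
            obj, _end = decoder.raw_decode(raw, match.start())
            return obj
        except json.JSONDecodeError:
            continue
    return None


fence = "`" * 3  # built this way to avoid a literal fence inside this example
raw_output = (
    f"Here is the result:\n{fence}json\n"
    '{"name": "John Doe", "age": 30}\n'
    f"{fence}\nLet me know if you need more!"
)
data = extract_first_json_object(raw_output)
print(data)  # {'name': 'John Doe', 'age': 30}
```

In an except OutputParserException block, the raw text to pass in is the model's reply, for example the exception's llm_output attribute when the parser populated it.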

✓ Prevention

Use OpenAI's structured response_format (JSON schema) or Anthropic's tool use to constrain outputs to a schema at the API level, which avoids most parser fragility and format mismatches.
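As an illustration of the OpenAI option, the sketch below builds a strict JSON-schema `response_format` payload for the name/age/email record used on this page. The helper `build_response_format` and the schema name `user_info` are hypothetical; the request itself is shown only as a comment and assumes the official openai Python SDK.

```python
from typing import Any, Dict


def build_response_format() -> Dict[str, Any]:
    """Build a strict JSON-schema response_format for the user record."""
    return {
        "type": "json_schema",
        "json_schema": {
            "name": "user_info",  # hypothetical schema name
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                    "email": {"type": "string"},
                },
                # Strict mode expects every property to be listed as required
                "required": ["name", "age", "email"],
                "additionalProperties": False,
            },
        },
    }


response_format = build_response_format()
# Assumed usage with the openai SDK (not run here):
# client.chat.completions.create(
#     model="gpt-4o-mini", messages=messages, response_format=response_format
# )
```

With a schema enforced by the API, `"age": "thirty"` can no longer come back as a string, so the downstream parser only has to decode JSON.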

Python 3.9+ · langchain-core >=0.1.0 · tested on 0.2.x

Verified 2026-04 · gpt-4o-mini, claude-3-5-haiku-20241022
