How to use with_structured_output in LangChain
Quick answer
Use the with_structured_output method on a LangChain chat model to bind an output schema (a Pydantic model, a TypedDict, or a JSON schema dict) so that the model's response comes back as a parsed Python object instead of raw text. This enables reliable extraction of structured data from LLM responses.

Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install langchain_openai>=0.2.0
Setup
Install the necessary LangChain package and set your OpenAI API key as an environment variable.
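If you prefer not to export the key in your shell, you can set it in-process before creating the client. A minimal sketch; "sk-placeholder" is a hypothetical stand-in for your real key:

```python
import os

# Use the shell-exported key if present; otherwise fall back to a
# placeholder (replace "sk-placeholder" with your real key)
os.environ.setdefault("OPENAI_API_KEY", "sk-placeholder")
print("OPENAI_API_KEY is set:", "OPENAI_API_KEY" in os.environ)
```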
- Install the LangChain OpenAI integration:

```shell
pip install langchain_openai
```

Step by step
This example shows how to use with_structured_output with LangChain's ChatOpenAI so that the model's response comes back directly as a Python dictionary.
```python
import os
from typing import TypedDict
from langchain_openai import ChatOpenAI

# Define the expected output schema as a TypedDict
class UserInfo(TypedDict):
    name: str
    age: int
    email: str

# Initialize the chat model
chat = ChatOpenAI(model="gpt-4o-mini", temperature=0, api_key=os.environ["OPENAI_API_KEY"])

# Bind the schema to the model; invoke() now returns a parsed dict
structured_chat = chat.with_structured_output(UserInfo)

# Define the prompt
prompt = "Provide sample user info with keys: name, age, email."

# Generate the parsed response directly
parsed_output = structured_chat.invoke(prompt)
print("Parsed output:", parsed_output)
```

Output

Parsed output: {'name': 'Alice', 'age': 30, 'email': 'alice@example.com'}

Common variations
You can pass different schema types to with_structured_output, such as a Pydantic model, a TypedDict, or a JSON schema dict. It also supports async calls and other chat models supported by LangChain.
- Use a Pydantic model as the schema to get validated objects back.
- Use await structured_chat.ainvoke(...) for async usage.
- Switch models by changing the model parameter.
```python
from pydantic import BaseModel
from langchain_openai import ChatOpenAI

# Define the schema as a Pydantic model for automatic validation
class UserInfo(BaseModel):
    name: str
    age: int
    email: str

chat = ChatOpenAI(model="gpt-4o-mini", temperature=0)
structured_chat = chat.with_structured_output(UserInfo)

# invoke() returns a validated UserInfo instance
parsed_output = structured_chat.invoke("Provide sample user info with keys: name, age, email.")
print(repr(parsed_output))
```

Output

UserInfo(name='Alice', age=30, email='alice@example.com')
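A Pydantic schema adds validation on top of parsing. This sketch exercises only that validation layer, with no API call, assuming Pydantic v2 is installed:

```python
from pydantic import BaseModel, ValidationError

class UserInfo(BaseModel):
    name: str
    age: int
    email: str

# A well-formed payload parses into a typed object
ok = UserInfo.model_validate({"name": "Alice", "age": 30, "email": "alice@example.com"})

# A wrong field type raises ValidationError instead of passing through silently
try:
    UserInfo.model_validate({"name": "Bob", "age": "thirty", "email": "bob@example.com"})
    bad_accepted = True
except ValidationError:
    bad_accepted = False

print(ok.age, bad_accepted)
```

This is why a Pydantic schema is a good default: malformed model output fails loudly at the parsing step rather than corrupting downstream data.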
Troubleshooting
- If parsing fails with a validation error, make sure the prompt clearly describes the fields the schema expects.
- Use temperature=0 to reduce randomness and improve structured output consistency.
- To handle unexpected text or formatting, pass include_raw=True to with_structured_output and inspect the raw response alongside the parsed result.
Key Takeaways
- Use with_structured_output to bind strict output schemas to a chat model for reliable data extraction from LLM responses.
- Set temperature=0 in your chat model to improve consistency of structured outputs.
- You can parse outputs into Python dicts (via TypedDict or JSON schema) or validated Pydantic objects for easy downstream processing.