How to use structured output with LangChain
Quick answer
Use LangChain's PydanticOutputParser with a Pydantic model as the output schema to parse AI responses into structured Python objects or JSON. This ensures consistent, machine-readable outputs from models like gpt-4o.

Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install langchain_openai>=0.2 openai>=1.0 pydantic
Setup
Install the required packages and set your OpenAI API key in the environment.
- Install the LangChain OpenAI bindings and Pydantic for schema validation:

pip install langchain_openai openai pydantic

Step by step
Define a Pydantic model as your output schema, create a PydanticOutputParser from it, then include the parser's get_format_instructions() in your prompt so the model responds in the structured format. Finally, parse the model's response back into the schema.
import os

from langchain_core.output_parsers import PydanticOutputParser
from langchain_openai import ChatOpenAI
from pydantic import BaseModel

# Define the structured output schema
class MovieInfo(BaseModel):
    title: str
    year: int
    rating: float

# Initialize the parser with the schema
parser = PydanticOutputParser(pydantic_object=MovieInfo)

# Build a prompt that appends the parser's format instructions
prompt = (
    "Provide movie information for 'Inception'.\n\n"
    + parser.get_format_instructions()
)

# Initialize the chat model
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0, api_key=os.environ["OPENAI_API_KEY"])

# Call the model
response = llm.invoke(prompt)

# Parse the structured output from the response text
movie_info = parser.parse(response.content)
print(movie_info)
print(f"Title: {movie_info.title}, Year: {movie_info.year}, Rating: {movie_info.rating}")

Output
title='Inception' year=2010 rating=8.8
Title: Inception, Year: 2010, Rating: 8.8
Common variations
You can use other models, such as gpt-4.1, or claude-3-5-sonnet-20241022 via langchain_anthropic, with the same parsing approach. Async calls are supported through the chat model's ainvoke method. You can also customize the output schema for nested or more complex data structures.
import asyncio

# ChatOpenAI supports async calls natively via ainvoke
async def async_example():
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0, api_key=os.environ["OPENAI_API_KEY"])
    prompt = (
        "Provide movie info for 'The Matrix'.\n\n"
        + parser.get_format_instructions()
    )
    response = await llm.ainvoke(prompt)
    movie_info = parser.parse(response.content)
    print(movie_info)

# To run the async example:
# asyncio.run(async_example())

Troubleshooting
- If parsing fails, ensure the model output strictly follows the schema by setting temperature=0 and including parser.get_format_instructions() in the prompt.
- Check for trailing text or explanations in the model output that can break parsing; you may need to tighten the prompt instructions.
- Verify your API key is set correctly in os.environ["OPENAI_API_KEY"].
Key takeaways
- Use LangChain's PydanticOutputParser with Pydantic schemas for reliable structured AI outputs.
- Always include parser.get_format_instructions() in your prompt to tell the model the expected output format.
- Set model temperature to 0 to reduce output randomness and improve parse accuracy.