Type annotations for chain inputs and outputs
Why this matters
In production, chains fail silently when they receive unexpected input shapes or produce unpredictable outputs. Type annotations catch these errors at development time, enable IDE autocomplete for chain inputs, and make debugging chain composition failures obvious instead of cryptic.
Explanation
In LangChain 1.2.x, you run a chain with Runnable.invoke(input), where input can be a dict, a Pydantic model, or a primitive. Without type annotations, neither Python nor your tooling knows what shape the chain expects, so you discover mismatches at runtime.
LangChain's Runnable base class exposes input_schema and output_schema properties that are inferred from your code where possible. When you pipe chains together with |, the composed chain takes its input schema from the first step and its output schema from the last. If the output type of chain A doesn't match the input type of chain B, static checkers such as mypy or Pylance can flag the mismatch before execution, provided each step carries precise type information; steps typed as Any pass unchecked.
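Define input and output types with Pydantic BaseModel classes or with Python type hints on custom runnables; prompt templates derive their input schema from the template variables. The chain exposes these schemas to downstream code for IDE autocomplete and static analysis, but it does not validate inputs against your models unless you add a validation step (see Common gotcha).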
Analogy
Think of it like a function signature in a statically-typed language. <code>def process(x: int) -> str</code> tells you exactly what goes in and what comes out. Without it, <code>def process(x)</code> in Python leaves everyone guessing. Pydantic models do for chain inputs/outputs what function signatures do for functions.
Code
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from typing import Optional

class ResearchRequest(BaseModel):
    """Input schema for research chain"""
    topic: str = Field(description="Topic to research")
    depth: Optional[str] = Field(default="moderate", description="Research depth: brief, moderate, or deep")

class ResearchResult(BaseModel):
    """Output schema for research chain"""
    summary: str
    key_findings: list[str]

template = """Research the topic: {topic} at {depth} depth.
Provide a summary and 3 key findings.
Format: SUMMARY: [text] | FINDINGS: [item1], [item2], [item3]"""

prompt = ChatPromptTemplate.from_template(template)
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

class ParseResearchOutput(StrOutputParser):
    """Custom parser that enforces ResearchResult output schema"""
    def invoke(self, input, config=None):
        text = super().invoke(input, config)
        lines = text.strip().split(" | ")
        summary = lines[0].replace("SUMMARY: ", "") if len(lines) > 0 else ""
        findings_text = lines[1].replace("FINDINGS: ", "") if len(lines) > 1 else ""
        findings = [f.strip() for f in findings_text.split(",")]
        return ResearchResult(summary=summary, key_findings=findings)

chain = prompt | llm | ParseResearchOutput()

# Invoke with typed input
request = ResearchRequest(topic="quantum computing", depth="moderate")
result = chain.invoke({"topic": request.topic, "depth": request.depth})

print(f"Type of result: {type(result)}")
print(f"Summary: {result.summary}")
print(f"Findings: {result.key_findings}")
print(f"\nInput schema: {prompt.input_schema}")
print(f"Output schema: {type(result).__name__}")

Output
Type of result: <class '__main__.ResearchResult'>
Summary: Quantum computing uses quantum bits (qubits) to process information exponentially faster than classical computers for specific problems.
Findings: ['Qubits leverage superposition and entanglement', 'IBM and Google lead quantum hardware development', 'Current systems have 100-1000 qubits with high error rates']

Input schema: input_schema='<schema details: topic (string, required), depth (string, optional, default=moderate)>'
Output schema: ResearchResult
What just happened?
We defined two Pydantic models: <code>ResearchRequest</code> for inputs and <code>ResearchResult</code> for outputs. We built a chain with a custom parser that returns a typed <code>ResearchResult</code> object instead of a raw string. When we invoke the chain with the fields of a <code>ResearchRequest</code>, the LLM output is parsed into the <code>ResearchResult</code> type, and we access fields like <code>result.summary</code> with full IDE autocomplete. The input schema is queryable through <code>prompt.input_schema</code>, and the output type is explicit in the parser's return value.
Common gotcha
The most common mistake: defining Pydantic models but never actually enforcing them on the chain. You write class MyInput(BaseModel): ... and then still call chain.invoke({"raw": "dict"}) without validation. The models sit unused. You must either (1) validate input before calling invoke(), or (2) wrap the chain in a custom Runnable that enforces the schema at invoke time. Pydantic models don't auto-enforce unless you wire them explicitly into the chain.
Error recovery
ValidationError
TypeError in custom parser
AttributeError on chain result
Experienced dev note
Type annotations on chains are not optional in production: they're your contract. The moment you build a chain that another service or function depends on, add Pydantic models. It saves hours of debugging when the LLM returns an edge case that breaks downstream parsing. Chains with explicit schemas are also easier to compose: when you pipe two chains with mismatched types, mypy or your IDE can catch it before the chain makes a single LLM call, as long as each step is precisely typed. This is especially valuable in multi-step agentic flows where output from one step feeds into the next.
Check your understanding
Write a chain that takes two inputs (a product name and a price) and outputs structured data with a discount amount and final price. How would you ensure that the final price output is always a float, and that a downstream chain expecting exactly that float type gets IDE autocomplete for it?
Answer hint
Your answer should cover: (1) using a Pydantic BaseModel with <code>price: float</code> and <code>discount: float</code> in the output schema, (2) creating a custom parser or Runnable that returns an instance of that model (not a dict or string), and (3) piping that chain to a downstream chain that accepts the output model as input, so the types align and compose cleanly.
Version note: LLMChain.predict() is a legacy API, deprecated during the 0.1.x series in favor of Runnable.invoke(). Type information is exposed through the input_schema and output_schema properties on Runnables, available in langchain-core 0.1.x and stable in 0.3.x.