Intermediate Course

LangChain Intermediate

63 lessons across 9 chapters. Every lesson is standalone — start anywhere.

1 Conversation Memory 7 lessons

Why memory is needed: the stateless problem Each LLM call forgets everything before it, so you must explicitly store conversation history to build coherent multi-turn interactions.

ChatMessageHistory: storing messages manually Manually construct and manage conversation history by building lists of message objects instead of relying on automatic memory abstractions.

RunnableWithMessageHistory: adding memory to any chain RunnableWithMessageHistory wraps any LangChain chain to automatically manage conversation history without rewriting your logic.

InMemoryChatMessageHistory vs persistent storage InMemoryChatMessageHistory keeps the conversation in RAM and loses it on restart; persistent storage keeps it across restarts.

Session IDs: managing multiple conversations Use session IDs to maintain separate conversation histories for different users or threads within the same application.

Trimming message history: preventing context overflow Automatically drop or summarize old conversation messages to stay within token limits while preserving recent context.

Summarization memory: compressing long histories Use summarization memory to compress long conversation histories into a condensed summary, keeping tokens low while maintaining context.
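
The session ID and trimming lessons above can be illustrated without any framework. The following is a minimal sketch in plain Python, not LangChain's API: the `histories` store and the helper names `add_message` and `trimmed_history` are hypothetical, and trimming here counts messages rather than tokens.

```python
from collections import defaultdict

# Hypothetical in-memory store: session_id -> list of (role, text) tuples.
# Like any in-memory history, it is lost when the process restarts.
histories = defaultdict(list)


def add_message(session_id, role, text):
    """Append one message to the history of a single session."""
    histories[session_id].append((role, text))


def trimmed_history(session_id, max_messages):
    """Return only the most recent max_messages messages for one session."""
    if max_messages <= 0:
        return []
    return histories[session_id][-max_messages:]


add_message("alice", "human", "Hi, I'm Alice.")
add_message("alice", "ai", "Hello Alice!")
add_message("alice", "human", "What's my name?")
add_message("bob", "human", "Hi, I'm Bob.")

# Only Alice's two latest messages would be sent to the model.
recent = trimmed_history("alice", 2)
```

Keying the store by session ID keeps each conversation separate, and slicing from the end drops the oldest messages first. A real application would likely count tokens instead of messages.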

2 Document Loading and Splitting 7 lessons

What document loaders do Document loaders convert files (PDF, TXT, JSON, web pages) into LangChain Document objects that LLMs can process.

PyPDFLoader: loading PDFs with metadata PyPDFLoader extracts text and metadata from PDF files, creating LangChain Document objects ready for RAG pipelines.

TextLoader and CSVLoader: plain text and tabular data Load plain text files and CSV data into LangChain documents for processing by LLMs.

WebBaseLoader: scraping web content WebBaseLoader fetches and parses HTML from URLs into LangChain documents for use in RAG pipelines.

RecursiveCharacterTextSplitter: the standard splitter RecursiveCharacterTextSplitter intelligently breaks long text into chunks while keeping semantic units together by trying progressively smaller separators.

Chunk size and overlap: tuning for retrieval quality Chunk size and overlap control how documents are split for retrieval: chunks that are too small lose context, chunks that are too large dilute relevance and waste computation, and overlap keeps text that straddles a boundary available in both neighboring chunks.

Why split_documents() preserves metadata but split_text() doesn't split_documents() wraps text chunks in Document objects to carry metadata through the chunking pipeline, while split_text() returns raw strings that lose context.
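
To make the effect of chunk size and overlap concrete, here is a minimal sketch of fixed-width character windows in plain Python. It is an illustration only: the function name `split_with_overlap` is hypothetical, and it does not reproduce the separator-based logic of RecursiveCharacterTextSplitter.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-width windows that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
# chunks == ["abcd", "cdef", "efgh", "ghij"]
```

Each chunk repeats the last two characters of the previous one, so text near a boundary appears in both neighbors.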

3 Embeddings and Vector Stores 7 lessons

What embeddings are and why they enable search Embeddings convert text into numerical vectors that enable semantic search by measuring meaning similarity, not just keyword matches.

OpenAIEmbeddings: the standard choice OpenAIEmbeddings converts text into dense numerical vectors using OpenAI's models, enabling semantic search and similarity operations.

ChromaDB: local vector store for development ChromaDB is an in-process vector database that stores embeddings locally, letting you build and test RAG applications without cloud infrastructure.

Creating a vector store from documents Convert documents into embeddings and store them in a searchable vector database for semantic retrieval.

k: tuning how many chunks to retrieve Control how many similar document chunks a retriever returns by setting the k parameter.

Persisting and reloading a vector store Save embeddings to disk after creation so you don't re-embed thousands of documents every time your application starts.

4 Retrieval Chains 7 lessons

The retrieval-augmented generation pattern RAG combines a retriever that fetches relevant documents with an LLM that generates answers, letting you ground responses in your own data.

Creating a retriever from a vector store Convert a vector store into a retriever that can fetch relevant documents using semantic search.

Building a retrieval chain with LCEL Chain a retriever directly to an LLM using LCEL's pipe operator to answer questions over documents.

Formatting retrieved documents for the prompt Transform raw retrieved documents into a single formatted string that fits cleanly into your LLM prompt without breaking context or token limits.

Adding conversation history to retrieval Maintain context across multiple turns by passing previous messages to your retrieval chain so the LLM can reference earlier exchanges.

create_retrieval_chain: the convenience wrapper create_retrieval_chain combines your retriever with a document-combining chain (such as one built with create_stuff_documents_chain), so retrieval, document formatting, and prompting run as a single chain.

Debugging wrong answers: always check retrieval first When your RAG chain gives wrong answers, the bug is almost always in retrieval, not the LLM. Learn how to isolate it.

5 Advanced Retrieval Patterns 7 lessons

MultiQueryRetriever: generating query variations to improve recall MultiQueryRetriever automatically generates multiple query reformulations to find more relevant documents when a single query might miss results.

ContextualCompressionRetriever: filtering irrelevant chunks after retrieval Filter out irrelevant document chunks after retrieval by using an LLM to compress and keep only the parts that answer your query.

ParentDocumentRetriever: indexing small chunks, returning large context Index small document chunks for fast retrieval while returning their full parent documents for richer context to the LLM.

SelfQueryRetriever: natural language metadata filtering SelfQueryRetriever lets LLMs translate natural language queries into structured metadata filters without you writing filter logic.

EnsembleRetriever: combining dense and sparse retrieval Combine keyword-based (sparse) and semantic (dense) retrieval to get better search results than either method alone.

Maximum marginal relevance: reducing redundancy in results MMR balances relevance and diversity by penalizing results that are too similar to already-selected ones, preventing your retriever from returning redundant documents.

Reranking retrieved documents with a cross-encoder Use a cross-encoder model to re-score and reorder retrieved documents by relevance before passing them to an LLM.
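
The maximum marginal relevance idea described in this chapter can be sketched in a few lines of plain Python. This is a simplified greedy version under stated assumptions, not a library implementation: the function names are hypothetical, vectors are plain lists, and the `lambda_mult` weight (1.0 means pure relevance, lower values favor diversity) is named here for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query_vec, doc_vecs, k, lambda_mult=0.5):
    """Greedily pick up to k document indices, trading relevance against redundancy."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest already-selected document (0 if none yet).
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query = [1.0, 0.0]
docs = [[1.0, 0.0], [1.0, 0.0], [3.0, 4.0]]  # docs 0 and 1 are duplicates

diverse = mmr(query, docs, k=2, lambda_mult=0.3)   # skips the duplicate
relevant = mmr(query, docs, k=2, lambda_mult=1.0)  # ignores redundancy
```

With a low `lambda_mult`, the second pick is the less similar document rather than the duplicate; with `lambda_mult=1.0` the selection reduces to plain relevance ranking.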

6 Tools and Tool Calling 7 lessons

What tools are in LangChain Tools in LangChain are callable objects that models can invoke to interact with the outside world (APIs, calculators, databases), and the framework provides structured patterns to define and bind them.

@tool decorator: wrapping a Python function Convert a plain Python function into a LangChain tool that LLMs and agents can discover and call automatically.

Tool name, description, and why descriptions matter Tool descriptions tell LLM agents what a tool does and when to use it: bad descriptions mean the agent never calls your tool or calls it wrong.

Binding tools to a model Attach tool definitions to an LLM so it can decide when and how to call them during a conversation.

How the model decides to call a tool The model uses tool schemas in the prompt to decide whether to call a tool and how to structure the call.

Parsing tool calls from model output Extract and parse structured tool calls that language models generate when responding to prompts with tool definitions.

Tool errors: handling failure gracefully Catch and recover from tool execution failures instead of crashing your agent loop.

7 Agents 7 lessons

What an agent is vs a chain A chain is a predetermined sequence of steps; an agent decides what steps to take based on reasoning over available tools.

create_react_agent: the standard agent pattern Build a reasoning agent that can use tools iteratively by combining ReAct prompting with LangGraph.

AgentExecutor: running the agent loop Use LangGraph to build and execute agents instead of the deprecated AgentExecutor pattern.

Agent scratchpad: how the agent thinks The scratchpad is the agent's working memory that tracks its thoughts, actions, and observations to reason through multi-step problems.

max_iterations: preventing infinite loops Set a hard limit on how many times an agent or chain can loop before stopping to prevent runaway execution and wasted API calls.

Verbose mode: watching the agent reason Enable verbose logging on LangChain agents to see the step-by-step reasoning, tool calls, and observations that happen inside the black box.

When to use an agent vs a chain Chains execute a fixed sequence of steps; agents decide which tools to use dynamically based on the input.

8 Structured Output 7 lessons

with_structured_output(): the modern pattern Use with_structured_output() to bind a Pydantic schema directly to an LLM and get validated Python objects back instead of parsing JSON strings yourself.

Defining output schema with Pydantic Use Pydantic models to enforce structured, type-safe outputs from LLMs instead of parsing unstructured strings.

JSON mode vs function calling mode JSON mode returns structured data as a string; function calling mode invokes tool definitions and returns parsed tool calls. Choose based on whether you control the schema or need to invoke external actions.

Strict mode: enforcing schema compliance Use with_structured_output(method='json_schema') to force LLMs to return only valid, typed data that matches your exact schema.

Nested schemas: complex structured output Define hierarchical Pydantic models to extract deeply nested structured data from unstructured text.

Handling partial or invalid structured output Use with_fallbacks and output parsers with error handling to gracefully manage LLM responses that don't match your expected schema.

Streaming structured output Stream structured data (like JSON objects) token-by-token from an LLM while maintaining schema validation.

9 Configurable Chains and RunnableConfig 7 lessons

RunnableConfig: passing runtime config to any chain RunnableConfig lets you pass runtime settings (like callbacks, tags, and model parameters) to any chain without modifying its definition.

configurable_fields: making model parameters runtime-adjustable Use configurable_fields to expose chain parameters like temperature or model choice as runtime variables instead of hardcoding them.

configurable_alternatives: swapping entire chain components at runtime Use configurable_alternatives to swap entire chain branches (LLM, retriever, prompt) without rebuilding the chain.

Passing config through nested chains Use RunnableConfig to propagate settings like API keys, timeouts, and callbacks through deeply nested chain operations without modifying intermediate function signatures.

Using tags and metadata in RunnableConfig for tracing Attach tags and metadata to Runnable chains to organize and filter execution traces for debugging and monitoring.

Building multi-tenant chains with per-user config Pass user-specific configuration through LangChain chains without breaking the pipeline or exposing sensitive data across tenants.

Environment-specific configuration: dev vs staging vs prod Load API keys, model names, and chain behavior differently based on your deployment environment using environment variables and configuration classes.