Intermediate Course
LangChain Intermediate
63 lessons across 9 chapters. Every lesson is standalone — start anywhere.
1 Conversation Memory 7 lessons
1 Why memory is needed: the stateless problem Each LLM call forgets everything before it, so you must explicitly store conversation history to build coherent multi-turn interactions.
2 ChatMessageHistory: storing messages manually Manually construct and manage conversation history by building lists of message objects instead of relying on automatic memory abstractions.
3 RunnableWithMessageHistory: adding memory to any chain RunnableWithMessageHistory wraps any LangChain chain to automatically manage conversation history without rewriting your logic.
4 InMemoryChatMessageHistory vs persistent storage InMemoryChatMessageHistory keeps the conversation in RAM, so it disappears on restart; persistent storage writes it to a database or file so it survives restarts.
5 Session IDs: managing multiple conversations Use session IDs to maintain separate conversation histories for different users or threads within the same application.
6 Trimming message history: preventing context overflow Automatically drop or summarize old conversation messages to stay within token limits while preserving recent context.
7 Summarization memory: compressing long histories Use summarization memory to compress long conversation histories into a condensed summary, keeping tokens low while maintaining context.
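The trimming idea from lesson 6 can be sketched without LangChain. This is only an illustration under stated assumptions: `count_tokens` uses whitespace-separated words as a crude stand-in for real tokens, and `trim_history` is a hypothetical helper, not LangChain's own trimming utility.

```python
def count_tokens(text: str) -> int:
    # Assumption: one whitespace-separated word counts as one token.
    return len(text.split())


def trim_history(messages, max_tokens):
    """Keep the most recent messages whose combined size fits max_tokens."""
    kept = []
    total = 0
    # Walk backwards so the newest messages are considered first.
    for role, content in reversed(messages):
        cost = count_tokens(content)
        if total + cost > max_tokens:
            break
        kept.append((role, content))
        total += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    ("human", "Hi, my name is Ada"),
    ("ai", "Hello Ada, how can I help?"),
    ("human", "What is LangChain?"),
    ("ai", "A framework for LLM apps"),
]
trimmed = trim_history(history, max_tokens=10)
print(trimmed)
```

With a budget of 10, only the last two messages survive, so the model no longer sees the user's name. That loss is what summarization memory (lesson 7) is meant to soften.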
2 Document Loading and Splitting 7 lessons
1 What document loaders do Document loaders convert files (PDF, TXT, JSON, web pages) into LangChain Document objects that LLMs can process.
2 PyPDFLoader: loading PDFs with metadata PyPDFLoader extracts text and metadata from PDF files, creating LangChain Document objects ready for RAG pipelines.
3 TextLoader and CSVLoader: plain text and tabular data Load plain text files and CSV data into LangChain documents for processing by LLMs.
4 WebBaseLoader: scraping web content WebBaseLoader fetches and parses HTML from URLs into LangChain documents for use in RAG pipelines.
5 RecursiveCharacterTextSplitter: the standard splitter RecursiveCharacterTextSplitter intelligently breaks long text into chunks while keeping semantic units together by trying progressively smaller separators.
6 Chunk size and overlap: tuning for retrieval quality Chunk size and overlap control how documents are split for retrieval: chunks that are too small lose context, chunks that are too large waste computation, and overlap reduces the chance that information straddling a chunk boundary is cut in half.
7 Why split_documents() preserves metadata but split_text() doesn't split_documents() wraps text chunks in Document objects to carry metadata through the chunking pipeline, while split_text() returns raw strings that lose context.
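How chunk size and overlap interact (lesson 6) is easiest to see with a fixed-size character window. The sketch below is a simplification: `split_with_overlap` is a hypothetical helper and does not reproduce the separator-based logic of RecursiveCharacterTextSplitter.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Slide a fixed-size window over text, stepping by size minus overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk repeats the last two characters of the previous one, so content near a boundary appears whole in at least one chunk.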
3 Embeddings and Vector Stores 7 lessons
1 What embeddings are and why they enable search Embeddings convert text into numerical vectors that enable semantic search by measuring meaning similarity, not just keyword matches.
2 OpenAIEmbeddings: the standard choice OpenAIEmbeddings converts text into dense numerical vectors using OpenAI's models, enabling semantic search and similarity operations.
3 ChromaDB: local vector store for development ChromaDB is an in-process vector database that stores embeddings locally, letting you build and test RAG applications without cloud infrastructure.
4 Creating a vector store from documents Convert documents into embeddings and store them in a searchable vector database for semantic retrieval.
5 Similarity search: how retrieval works Similarity search finds the most relevant documents from a collection by measuring how semantically close they are to a query using vector embeddings.
6 k: tuning how many chunks to retrieve Control how many similar document chunks a retriever returns by setting the k parameter.
7 Persisting and reloading a vector store Save embeddings to disk after creation so you don't re-embed thousands of documents every time your application starts.
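Similarity search and the k parameter (lessons 5 and 6) can be sketched in plain Python. The two-dimensional vectors below are made up for illustration; a real embedding model produces vectors with hundreds or thousands of dimensions, and `top_k` is a hypothetical helper, not a vector store API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k):
    """Return the texts of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Assumption: toy embeddings chosen by hand, not produced by a model.
store = [
    ("cats purr", [1.0, 0.0]),
    ("dogs bark", [0.0, 1.0]),
    ("kittens meow", [0.9, 0.1]),
]
results = top_k([1.0, 0.0], store, k=2)
print(results)  # ['cats purr', 'kittens meow']
```

Raising k returns more chunks at the cost of a longer prompt; lowering it risks missing the chunk that holds the answer.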
4 Retrieval Chains 7 lessons
1 The retrieval-augmented generation pattern RAG combines a retriever that fetches relevant documents with an LLM that generates answers, letting you ground responses in your own data.
2 Creating a retriever from a vector store Convert a vector store into a retriever that can fetch relevant documents using semantic search.
3 Building a retrieval chain with LCEL Chain a retriever directly to an LLM using LCEL's pipe operator to answer questions over documents.
4 Formatting retrieved documents for the prompt Transform raw retrieved documents into a single formatted string that fits cleanly into your LLM prompt without breaking context or token limits.
5 Adding conversation history to retrieval Maintain context across multiple turns by passing previous messages to your retrieval chain so the LLM can reference earlier exchanges.
6 create_retrieval_chain: the convenience wrapper create_retrieval_chain wraps your retriever and LLM into a single production-ready chain that handles document formatting and prompt templating automatically.
7 Debugging wrong answers: always check retrieval first When your RAG chain gives wrong answers, the bug is almost always in retrieval, not the LLM: here's how to isolate it.
5 Advanced Retrieval Patterns 7 lessons
1 MultiQueryRetriever: generating query variations to improve recall MultiQueryRetriever automatically generates multiple query reformulations to find more relevant documents when a single query might miss results.
2 ContextualCompressionRetriever: filtering irrelevant chunks after retrieval Filter out irrelevant document chunks after retrieval by using an LLM to compress and keep only the parts that answer your query.
3 ParentDocumentRetriever: indexing small chunks, returning large context Index small document chunks for fast retrieval while returning their full parent documents for richer context to the LLM.
4 SelfQueryRetriever: natural language metadata filtering SelfQueryRetriever lets LLMs translate natural language queries into structured metadata filters without you writing filter logic.
5 EnsembleRetriever: combining dense and sparse retrieval Combine keyword-based (sparse) and semantic (dense) retrieval to get better search results than either method alone.
6 Maximum marginal relevance: reducing redundancy in results MMR balances relevance and diversity by penalizing results that are too similar to already-selected ones, preventing your retriever from returning redundant documents.
7 Reranking retrieved documents with a cross-encoder Use a cross-encoder model to re-score and reorder retrieved documents by relevance before passing them to an LLM.
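The relevance-versus-diversity trade-off behind maximum marginal relevance (lesson 6) can be sketched as a greedy selection loop. This is a minimal illustration with hand-made vectors; the `mmr` function and its `lambda_mult` parameter name are assumptions for this sketch rather than a specific library's implementation.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr(query_vec, doc_vecs, k, lambda_mult=0.5):
    """Greedily pick k document indices, trading relevance against redundancy."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Penalty: similarity to the closest already-selected document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Documents 0 and 1 are exact duplicates; document 2 is equally relevant
# to the query but points in a different direction.
docs = [[0.8, 0.6], [0.8, 0.6], [0.8, -0.6]]
picked = mmr([1.0, 0.0], docs, k=2)
print(picked)  # [0, 2]
```

A plain top-2 similarity search would be free to return both duplicates; the redundancy penalty pushes the second pick toward document 2 instead.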
6 Tools and Tool Calling 7 lessons
1 What tools are in LangChain Tools in LangChain are callable objects that models can invoke to interact with the outside world (APIs, calculators, databases), and the framework provides structured patterns to define and bind them.
2 @tool decorator: wrapping a Python function Convert a plain Python function into a LangChain tool that LLMs and agents can discover and call automatically.
3 Tool name, description, and why descriptions matter Tool descriptions tell LLM agents what a tool does and when to use it; with a bad description, the agent may never call your tool or may call it incorrectly.
4 Binding tools to a model Attach tool definitions to an LLM so it can decide when and how to call them during a conversation.
5 How the model decides to call a tool The model uses tool schemas in the prompt to decide whether to call a tool and how to structure the call.
6 Parsing tool calls from model output Extract and parse structured tool calls that language models generate when responding to prompts with tool definitions.
7 Tool errors: handling failure gracefully Catch and recover from tool execution failures instead of crashing your agent loop.
7 Agents 7 lessons
1 What an agent is vs a chain A chain is a predetermined sequence of steps; an agent decides what steps to take based on reasoning over available tools.
2 create_react_agent: the standard agent pattern Build a reasoning agent that can use tools iteratively by combining ReAct prompting with LangGraph.
3 AgentExecutor: running the agent loop Use LangGraph to build and execute agents instead of the deprecated AgentExecutor pattern.
4 Agent scratchpad: how the agent thinks The scratchpad is the agent's working memory that tracks its thoughts, actions, and observations to reason through multi-step problems.
5 max_iterations: preventing infinite loops Set a hard limit on how many times an agent or chain can loop before stopping to prevent runaway execution and wasted API calls.
6 Verbose mode: watching the agent reason Enable verbose logging on LangChain agents to see the step-by-step reasoning, tool calls, and observations that happen inside the black box.
7 When to use an agent vs a chain Chains execute a fixed sequence of steps; agents decide which tools to use dynamically based on the input.
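The agent loop, scratchpad, and iteration limit described in this chapter can be sketched without any LLM. Everything here is hypothetical: `run_agent` is not a LangChain or LangGraph function, and `scripted_decide` stands in for the model's decision step.

```python
def run_agent(decide, tools, question, max_iterations=3):
    """Loop: ask the decision function for an action, run the tool, record it."""
    scratchpad = []  # list of (tool_name, tool_input, observation)
    for _ in range(max_iterations):
        action = decide(question, scratchpad)
        if action["type"] == "final":
            return action["answer"]
        observation = tools[action["tool"]](action["input"])
        scratchpad.append((action["tool"], action["input"], observation))
    # The hard limit stops a decision function that never finishes.
    return "Stopped: iteration limit reached"


def scripted_decide(question, scratchpad):
    # Assumption: a scripted stand-in for an LLM choosing the next step.
    if not scratchpad:
        return {"type": "tool", "tool": "add", "input": (2, 3)}
    return {"type": "final", "answer": f"The sum is {scratchpad[-1][2]}"}


tools = {"add": lambda pair: pair[0] + pair[1]}
answer = run_agent(scripted_decide, tools, "What is 2 + 3?")
print(answer)  # The sum is 5
```

A chain would hard-code the call to `add`; here the decision function chooses the step each time, which is why a limit like `max_iterations` is needed.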
8 Structured Output 7 lessons
1 with_structured_output(): the modern pattern Use with_structured_output() to bind a Pydantic schema directly to an LLM and get validated Python objects back instead of parsing JSON strings yourself.
2 Defining output schema with Pydantic Use Pydantic models to enforce structured, type-safe outputs from LLMs instead of parsing unstructured strings.
3 JSON mode vs function calling mode JSON mode returns structured data as a string; function calling mode invokes tool definitions and returns parsed tool calls. Choose based on whether you control the schema or need to invoke external actions.
4 Strict mode: enforcing schema compliance Use with_structured_output(method='json_schema') to force LLMs to return only valid, typed data that matches your exact schema.
5 Nested schemas: complex structured output Define hierarchical Pydantic models to extract deeply nested structured data from unstructured text.
6 Handling partial or invalid structured output Use with_fallbacks and output parsers with error handling to gracefully manage LLM responses that don't match your expected schema.
7 Streaming structured output Stream structured data (like JSON objects) token-by-token from an LLM while maintaining schema validation.
9 Configurable Chains and RunnableConfig 7 lessons
1 RunnableConfig: passing runtime config to any chain RunnableConfig lets you pass runtime settings (like callbacks, tags, and model parameters) to any chain without modifying its definition.
2 configurable_fields: making model parameters runtime-adjustable Use configurable_fields to expose chain parameters like temperature or model choice as runtime variables instead of hardcoding them.
3 configurable_alternatives: swapping entire chain components at runtime Use configurable_alternatives to swap entire chain branches (LLM, retriever, prompt) without rebuilding the chain.
4 Passing config through nested chains Use RunnableConfig to propagate settings like API keys, timeouts, and callbacks through deeply nested chain operations without modifying intermediate function signatures.
5 Using tags and metadata in RunnableConfig for tracing Attach tags and metadata to Runnable chains to organize and filter execution traces for debugging and monitoring.
6 Building multi-tenant chains with per-user config Pass user-specific configuration through LangChain chains without breaking the pipeline or exposing sensitive data across tenants.
7 Environment-specific configuration: dev vs staging vs prod Load API keys, model names, and chain behavior differently based on your deployment environment using environment variables and configuration classes.