Recursive retrieval: following node relationships
Why this matters
By default, retrievers return isolated chunks. Recursive retrieval lets you automatically pull in parent summaries or child details for the chunks that match, which can improve answer quality without manual prompt engineering.
Explanation
Recursive retrieval is a strategy where retrieval doesn't stop at the first matching node: instead, it follows defined relationships (parent → summary, child → detail) to fetch additional context. The pattern works like this: retrieve an initial set of nodes, inspect their relationships, then conditionally fetch related nodes based on relevance thresholds or node type.
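Mechanically, in LlamaIndex this takes a RecursiveRetriever wrapped in a RetrieverQueryEngine, plus nodes that reference one another. When you query, the root retriever returns candidate nodes. Any candidate that is an IndexNode is treated as a pointer: the retriever follows its index_id to another node, another retriever, or a query engine, and repeats the process on whatever that returns. Relevance filtering beyond the first-pass ranking (a similarity threshold or a secondary LLM call) is optional and layered on top. The final context passed to the LLM holds the plain nodes that matched directly plus whatever the references resolved to, so the model can see high-level summaries and fine-grained details in a single context window.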
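The retrieve, inspect, follow pattern described in this section can be sketched without any library. This is a toy illustration, not LlamaIndex's implementation: the Node class, the word-overlap score function, and the threshold value are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    node_id: str
    text: str
    parent: Optional[str] = None                       # ID of the parent node, if any
    children: List[str] = field(default_factory=list)  # IDs of direct children


def score(query: str, text: str) -> float:
    """Toy relevance: fraction of query words that also appear in the text."""
    words = set(query.lower().split())
    return len(words & set(text.lower().split())) / len(words)


def recursive_retrieve(query: str, nodes: Dict[str, Node],
                       top_k: int = 1, threshold: float = 0.1) -> List[str]:
    # Step 1: first-pass retrieval over every node.
    ranked = sorted(nodes.values(), key=lambda n: score(query, n.text), reverse=True)
    context = {n.node_id: n for n in ranked[:top_k]}
    # Step 2: inspect the relationships of each first-pass match.
    for match in list(context.values()):
        # The parent is always included, for high-level context.
        if match.parent is not None:
            context.setdefault(match.parent, nodes[match.parent])
        # Step 3: a child is followed only if it clears the relevance threshold.
        for child_id in match.children:
            if score(query, nodes[child_id].text) >= threshold:
                context.setdefault(child_id, nodes[child_id])
    return list(context)


NODES = {
    "chapter": Node("chapter", "machine learning fundamentals",
                    children=["supervised", "unsupervised"]),
    "supervised": Node("supervised", "supervised learning trains models on labeled data",
                       parent="chapter", children=["gd", "trees"]),
    "gd": Node("gd", "gradient descent adjusts weights in supervised training",
               parent="supervised"),
    "trees": Node("trees", "decision trees split data on feature values",
                  parent="supervised"),
    "unsupervised": Node("unsupervised", "unsupervised learning finds patterns in unlabeled data",
                         parent="chapter"),
}

result = recursive_retrieve("how does supervised learning work", NODES)
print(result)  # ['supervised', 'chapter', 'gd']
```

The "trees" child is skipped because it shares no words with the query, which is the conditional part of the pattern: relationships are candidates, not automatic inclusions.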
Use recursive retrieval when your documents have hierarchical structure (chapters → sections → paragraphs, or documents with summaries), when first-pass retrieval returns nodes that need context from their parent, or when you want the retriever to be intelligent about following relationships rather than always including all hierarchy levels.
Analogy
It's like a researcher finding a relevant paper citation, then automatically fetching the parent survey paper to understand the broader context, and the child figures to see the detailed evidence: all in one intelligent pass.
Code
import os

from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import RecursiveRetriever
from llama_index.core.schema import IndexNode, TextNode
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

os.environ["OPENAI_API_KEY"] = "your-api-key"  # placeholder: use your own key
Settings.llm = OpenAI(model="gpt-4.1")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Chapter: Machine Learning Fundamentals
# Each section maps to (overview, key detail): two small, searchable chunks.
sections = {
    "section-1": (
        "Section 1: Supervised Learning\nSupervised learning uses labeled data to train models. Common algorithms include linear regression, decision trees, and neural networks. The goal is to minimize error on training data while generalizing to new data.",
        "Key Detail: Gradient descent is the optimization algorithm used to adjust weights. It computes the gradient of the loss function and moves in the negative direction.",
    ),
    "section-2": (
        "Section 2: Unsupervised Learning\nUnsupervised learning finds patterns in unlabeled data. Clustering groups similar data points, dimensionality reduction compresses data, and anomaly detection identifies outliers.",
        "Key Detail: K-means clustering partitions data into k clusters by iteratively assigning points to nearest centroids and updating centroid positions.",
    ),
}

# Parents hold the full section text. They are looked up by ID, never embedded.
parent_nodes = {
    section_id: TextNode(text="\n".join(chunks), id_=section_id)
    for section_id, chunks in sections.items()
}

# Children are what gets embedded. Each IndexNode's index_id names its parent.
child_nodes = [
    IndexNode(text=chunk, index_id=section_id)
    for section_id, chunks in sections.items()
    for chunk in chunks
]

index = VectorStoreIndex(child_nodes)
retriever = RecursiveRetriever(
    "vector",  # root_id: the retriever that receives the query first
    retriever_dict={"vector": index.as_retriever(similarity_top_k=2)},
    node_dict=parent_nodes,  # resolves each index_id to its parent node
    verbose=False,
)
query_engine = RetrieverQueryEngine.from_args(retriever)

query = "How does gradient descent work?"
response = query_engine.query(query)
print("Query Response:")
print(response)
print("\nRetrieved Nodes Count:", len(retriever.retrieve(query)))

Sample output (exact wording varies between runs):
Query Response:
Gradient descent is the optimization algorithm used in supervised learning to adjust model weights. It computes the gradient of the loss function, which indicates the direction of steepest increase in error, and moves in the negative direction to minimize error. This iterative process continues until convergence, enabling the model to learn from labeled training data effectively.

The last line printed is the retrieved node count: at most 2 here, because similarity_top_k=2 and each matched child resolves to a single parent section.
What just happened?
The code split one chapter into two parent nodes (one per section) and four small child nodes (a section overview and a 'Key Detail' for each section). Only the children were embedded; each is an IndexNode whose index_id names its parent section. When queried about gradient descent, the RecursiveRetriever (1) asked the vector retriever for the top-2 most similar child nodes, (2) recognized them as IndexNodes and followed each index_id into node_dict, (3) swapped the small matches for their full parent sections, merging matches that point at the same parent, and (4) passed those sections to the LLM for answer generation. The query engine returned an answer synthesized from the recursively fetched nodes.
Common gotcha
A common mistake is to follow relationships too deeply (for example, five levels down) in the hope of richer context. The retriever then pulls in distant, irrelevant nodes and bloats the context window, which degrades answer quality. Start with one level (the direct parent or immediate children only) and measure quality before going deeper. Also, if your nodes carry no explicit parent-child references (for example, IndexNode objects whose index_id points at another node), recursive retrieval has nothing to follow: you must build those references first via node IDs or document metadata.
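To make the cost of depth concrete, here is a minimal sketch that counts how many nodes end up in the context as the follow depth grows. The CHILDREN mapping and the expand helper are hypothetical, written only for this illustration; they are not part of any library.

```python
from typing import Dict, List

# Hypothetical hierarchy: node ID -> IDs of its direct children.
CHILDREN: Dict[str, List[str]] = {
    "book": ["ch1", "ch2"],
    "ch1": ["s1", "s2"],
    "ch2": ["s3", "s4"],
    "s1": ["p1", "p2"],
    "s2": ["p3", "p4"],
    "s3": ["p5", "p6"],
    "s4": ["p7", "p8"],
}


def expand(matches: List[str], children: Dict[str, List[str]], max_depth: int) -> List[str]:
    """Return the matches plus every descendant up to max_depth levels below them."""
    context = list(matches)
    frontier = list(matches)
    for _ in range(max_depth):
        next_frontier = []
        for node_id in frontier:
            for child_id in children.get(node_id, []):
                if child_id not in context:
                    context.append(child_id)
                    next_frontier.append(child_id)
        frontier = next_frontier
    return context


# Context size for a single match at the top of the hierarchy.
sizes = {depth: len(expand(["book"], CHILDREN, depth)) for depth in (0, 1, 2, 3)}
print(sizes)  # {0: 1, 1: 3, 2: 7, 3: 15}

# With no relationships defined, depth has no effect at all.
print(expand(["book"], {}, 5))  # ['book']
```

In this toy tree each extra level doubles what gets added, so the context grows from 1 node to 15 in three steps, while an empty relationship map returns only the original match however deep you ask it to go.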
Error recovery
ValueError: retriever not initialized
AttributeError: _retrieve_recursive
Empty response from query_engine
Experienced dev note
Recursive retrieval is often slower than flat retrieval because every followed reference adds work: an extra lookup, a nested retrieval, or, when a reference points at a query engine, an extra LLM call. In production, measure end-to-end latency and compare against simpler approaches (e.g., always storing the parent summary in each chunk's metadata). Sometimes pre-flattening your hierarchy into richer initial chunks beats recursive fetching. Also, the relationship graph must be built *before* indexing: you can't retrofit it afterward without reindexing.
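As a rough sketch of the pre-flattening alternative, the snippet below prepends each chunk's parent summary to the chunk text so a single flat lookup returns both, and wraps the step in a small timer of the kind you could reuse to compare retrieval strategies. SUMMARIES, CHUNKS, flatten, and timed are hypothetical names invented for this example.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical data: parent summaries and the chunks that belong to them.
SUMMARIES: Dict[str, str] = {
    "supervised": "Supervised learning trains models on labeled data.",
}
CHUNKS: List[Dict[str, str]] = [
    {
        "parent": "supervised",
        "text": "Gradient descent adjusts weights by following the negative gradient of the loss.",
    },
]


def flatten(chunks: List[Dict[str, str]], summaries: Dict[str, str]) -> List[str]:
    """Prepend each chunk's parent summary so no reference needs following at query time."""
    return [f"{summaries[chunk['parent']]}\n{chunk['text']}" for chunk in chunks]


def timed(fn: Callable[..., Any], *args: Any) -> Tuple[Any, float]:
    """Run fn(*args) and return its result with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


flat_chunks, seconds = timed(flatten, CHUNKS, SUMMARIES)
print(flat_chunks[0])
print(f"flatten took {seconds:.6f}s")
```

The trade-off is storage and embedding cost: every chunk now repeats its parent summary, so this tends to suit shallow hierarchies with short summaries.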
Check your understanding
You have a 3-level document hierarchy: book → chapters → sections. Your recursive retriever follows child relationships one level deep. A query matches a chapter-level node. Will the retriever also fetch the section-level nodes (grandchildren of the book)? Why or why not?
Answer hint
Yes. A depth of 1 means one level down from each matched node. Sections are grandchildren of the book, but they are direct children of the matched chapter, so they are fetched. If the match were at section level instead, the retriever would look for children of that section, which don't exist in this hierarchy, and fetch nothing extra. Depth is relative to the matched node, not absolute.
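The point that depth is counted from the matched node can be checked with a small recursive sketch. TREE and fetch_children are hypothetical and exist only for this illustration.

```python
from typing import Dict, List

# Hypothetical book -> chapters -> sections hierarchy.
TREE: Dict[str, List[str]] = {
    "book": ["chapter-1", "chapter-2"],
    "chapter-1": ["section-1.1", "section-1.2"],
    "chapter-2": ["section-2.1"],
}


def fetch_children(match: str, depth: int) -> List[str]:
    """Collect descendants of `match`, counting depth from the matched node."""
    if depth == 0:
        return []
    found: List[str] = []
    for child in TREE.get(match, []):
        found.append(child)
        found.extend(fetch_children(child, depth - 1))
    return found


print(fetch_children("chapter-1", 1))    # ['section-1.1', 'section-1.2']
print(fetch_children("section-1.1", 1))  # []
print(fetch_children("book", 1))         # ['chapter-1', 'chapter-2']
```

The same depth of 1 reaches sections from a chapter match, nothing from a section match, and only chapters from a book match.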