Code Intermediate medium · 7 min

Recursive retrieval: following node relationships

What you will learn

Use recursive retrieval to automatically follow parent-child node relationships and fetch additional context during query answering.

Why this matters

By default, retrievers return isolated chunks. Recursive retrieval automatically pulls in parent summaries or child details when they are relevant, which can noticeably improve answer quality without manual prompt engineering.

Skip if: your document structure is flat (a single level), latency is critical (each recursion step adds extra retrieval calls), or your chunks are already semantically complete and don't depend on parent context.

Explanation

Recursive retrieval is a strategy where retrieval doesn't stop at the first matching node: instead, it follows defined relationships (parent → summary, child → detail) to fetch additional context. The pattern works like this: retrieve an initial set of nodes, inspect their relationships, then conditionally fetch related nodes based on relevance thresholds or node type.
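The retrieve, inspect, and follow loop can be sketched in plain Python with no LlamaIndex dependency. Everything here is an assumption made for illustration: the `Node` dataclass, the word-overlap `score` function, and the `min_score` threshold stand in for real embeddings and a real node store.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    node_id: str
    text: str
    parent_id: Optional[str] = None  # hypothetical link to a parent/summary node


def score(query: str, text: str) -> float:
    """Toy relevance: fraction of query words that also appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def recursive_retrieve(
    query: str, store: Dict[str, Node], top_k: int = 1, min_score: float = 0.5
) -> List[Node]:
    # Step 1: first-pass retrieval, ranked by the toy score.
    ranked = sorted(store.values(), key=lambda n: score(query, n.text), reverse=True)
    results: List[Node] = []
    seen = set()
    for hit in ranked[:top_k]:
        # Step 2: only follow relationships for hits that clear the threshold.
        follow = score(query, hit.text) >= min_score
        node: Optional[Node] = hit
        # Step 3: walk up the parent chain, skipping nodes already collected.
        while node is not None and node.node_id not in seen:
            seen.add(node.node_id)
            results.append(node)
            node = store.get(node.parent_id) if follow and node.parent_id else None
    return results


store = {
    "sec1": Node("sec1", "Section 1: Supervised learning uses labeled data to train models."),
    "gd": Node("gd", "Gradient descent adjusts weights using the gradient of the loss.", parent_id="sec1"),
    "km": Node("km", "K-means clustering assigns points to the nearest centroids."),
}

hits = recursive_retrieve("gradient descent weights", store)
print([n.node_id for n in hits])  # ['gd', 'sec1']
```

Raising `min_score` above the hit's score returns only the matched node, which is the "conditionally fetch" part of the pattern.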

Mechanically, in LlamaIndex this is a RecursiveRetriever wrapped in a RetrieverQueryEngine. The retriever takes a root_id, a retriever_dict that maps ids to retrievers, and optional query_engine_dict and node_dict lookups. When you query, the root retriever returns candidate nodes. Any candidate that is an IndexNode carries an index_id; the retriever looks that id up and either returns the referenced node, runs the referenced retriever, or queries the referenced query engine, repeating the process on whatever comes back. The final context passed to the LLM is the set of resolved nodes, so the model can see both high-level summaries and fine-grained details in a single context window.
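One way to picture the lookup step is a table that maps a reference id either to a stored node or to another retriever. The sketch below is a simplified model written for this lesson, not LlamaIndex's implementation: `resolve`, the `("ref", id)` tuple convention, and the two dictionaries are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple, Union

# A result is either plain text or a ("ref", target_id) pointer to something else.
Result = Union[str, Tuple[str, str]]
Retriever = Callable[[str], List[Result]]


def resolve(
    query: str,
    target_id: str,
    nodes: Dict[str, str],
    retrievers: Dict[str, Retriever],
    seen: Optional[Set[str]] = None,
) -> List[str]:
    """Resolve target_id to text, recursing through any references it returns."""
    seen = set() if seen is None else seen
    if target_id in seen:  # guard against reference cycles
        return []
    seen.add(target_id)

    if target_id in nodes:  # the id names a stored node: return it directly
        return [nodes[target_id]]

    if target_id in retrievers:  # the id names a retriever: run it and recurse
        resolved: List[str] = []
        for item in retrievers[target_id](query):
            if isinstance(item, tuple) and item[0] == "ref":
                resolved.extend(resolve(query, item[1], nodes, retrievers, seen))
            else:
                resolved.append(item)
        return resolved

    raise KeyError(f"unknown reference id: {target_id}")


nodes = {"sec1": "Section 1: Supervised learning uses labeled data."}
retrievers = {
    "root": lambda q: [("ref", "sec1"), "Gradient descent adjusts weights."],
}

context = resolve("gradient descent", "root", nodes, retrievers)
print(context)
```

The `KeyError` branch matters in practice: a reference that points at nothing should fail loudly rather than silently shrink the context.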

Use recursive retrieval when your documents have hierarchical structure (chapters → sections → paragraphs, or documents with summaries), when first-pass retrieval returns nodes that need context from their parent, or when you want the retriever to be intelligent about following relationships rather than always including all hierarchy levels.

Analogy

It's like a researcher finding a relevant paper citation, then automatically fetching the parent survey paper to understand the broader context, and the child figures to see the detailed evidence: all in one intelligent pass.

Code

Illustrative only - not runnable without a valid API key

python

import os

from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import RecursiveRetriever
from llama_index.core.schema import IndexNode, TextNode
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

# Placeholder only: export a real OPENAI_API_KEY instead of hardcoding one.
os.environ.setdefault("OPENAI_API_KEY", "your-api-key")

Settings.llm = OpenAI(model="gpt-4.1")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

gd_detail = "Key Detail: Gradient descent is the optimization algorithm used to adjust weights. It computes the gradient of the loss function and moves in the negative direction."
km_detail = "Key Detail: K-means clustering partitions data into k clusters by iteratively assigning points to nearest centroids and updating centroid positions."

# Parent nodes: the full sections, looked up by id after retrieval.
parents = [
    TextNode(
        id_="sec-supervised",
        text="Chapter: Machine Learning Fundamentals\nSection 1: Supervised Learning\n"
        "Supervised learning uses labeled data to train models. Common algorithms include linear regression, decision trees, and neural networks. The goal is to minimize error on training data while generalizing to new data.\n"
        + gd_detail,
    ),
    TextNode(
        id_="sec-unsupervised",
        text="Chapter: Machine Learning Fundamentals\nSection 2: Unsupervised Learning\n"
        "Unsupervised learning finds patterns in unlabeled data. Clustering groups similar data points, dimensionality reduction compresses data, and anomaly detection identifies outliers.\n"
        + km_detail,
    ),
]

# Child nodes: the small chunks that get embedded. Each IndexNode points at
# its parent section through index_id.
children = [
    IndexNode(text=gd_detail, index_id="sec-supervised"),
    IndexNode(text=km_detail, index_id="sec-unsupervised"),
]

index = VectorStoreIndex(children)

retriever = RecursiveRetriever(
    "vector",  # root_id: must be a key in retriever_dict
    retriever_dict={"vector": index.as_retriever(similarity_top_k=2)},
    node_dict={node.node_id: node for node in parents},
    verbose=False,
)

query_engine = RetrieverQueryEngine.from_args(retriever)

query = "How does gradient descent work?"
response = query_engine.query(query)
print("Query Response:")
print(response)
# retrieve() is the public entry point; it returns the resolved nodes.
print("\nRetrieved Nodes Count:", len(retriever.retrieve(query)))

Output

Query Response:
Gradient descent is the optimization algorithm used in supervised learning to adjust model weights. It computes the gradient of the loss function, which indicates the direction of steepest increase in error, and moves in the negative direction to minimize error. This iterative process continues until convergence, enabling the model to learn from labeled training data effectively.

Retrieved Nodes Count: 2

What just happened?

The code models the hierarchy explicitly: two parent TextNodes hold the full sections, and two small IndexNodes hold the 'Key Detail' sentences, each pointing at its parent section through index_id. Only the small child nodes are embedded and indexed. When queried about gradient descent, the RecursiveRetriever (1) ran the root retriever to get the top-2 child nodes, (2) saw that each hit was an IndexNode, (3) looked up each index_id in node_dict and swapped the child for its parent section, and (4) returned the two resolved sections, which the query engine passed to the LLM for answer generation. The answer is therefore synthesized from the full sections rather than from the short chunks that matched the query.

Common gotcha

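A common mistake is to follow too many levels of links in the hope of richer context. Each extra hop pulls in nodes that are further from the query and bloats the context window, which can degrade answer quality. Start with a single hop (a child to its immediate parent, or a parent to its immediate children) and measure quality before adding more. Note that RecursiveRetriever in llama-index-core has no depth argument such as child_depth; how far it recurses depends on how your IndexNode references chain together. Also, if none of the retrieved nodes is an IndexNode, recursive retrieval has no paths to follow and behaves like flat retrieval: you must build the references first (IndexNode.index_id plus matching entries in node_dict, retriever_dict, or query_engine_dict).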
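To see how quickly extra hops inflate the context, here is a small sketch that counts how many nodes a depth-limited walk would collect. The `CHILDREN` hierarchy and the `descendants` helper are hypothetical and exist only for this illustration.

```python
from typing import Dict, List

# Hypothetical hierarchy: each id maps to the ids of its direct children.
CHILDREN: Dict[str, List[str]] = {
    "book": ["ch1", "ch2"],
    "ch1": ["s1.1", "s1.2"],
    "ch2": ["s2.1", "s2.2"],
    "s1.1": ["p1", "p2"],
    "s1.2": ["p3"],
    "s2.1": [],
    "s2.2": ["p4", "p5"],
}


def descendants(node_id: str, max_depth: int) -> List[str]:
    """Collect descendants of node_id, walking at most max_depth levels down."""
    if max_depth <= 0:
        return []
    found: List[str] = []
    for child in CHILDREN.get(node_id, []):
        found.append(child)
        found.extend(descendants(child, max_depth - 1))
    return found


for depth in (1, 2, 3):
    print(f"depth={depth}: {len(descendants('book', depth))} extra nodes")
# depth=1: 2 extra nodes
# depth=2: 6 extra nodes
# depth=3: 11 extra nodes
```

Depth is counted from the node that matched, so `descendants("ch1", 1)` returns only the two sections under that chapter.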

Error recovery

ValueError: Root id not in retriever_dict

RecursiveRetriever needs a root_id and a retriever_dict that contains that id. Register the index retriever under the same key: RecursiveRetriever("vector", retriever_dict={"vector": index.as_retriever(similarity_top_k=2)})

AttributeError: _retrieve_recursive

RecursiveRetriever has no method with this name, and underscore-prefixed methods are internal in any case. Call retriever.retrieve(query) instead; it returns the recursively fetched nodes.

Empty response from query_engine

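Check retrieval separately from synthesis: call retriever.retrieve(query) and inspect what comes back. Recursion only triggers for retrieved nodes that are IndexNode instances whose index_id resolves to an entry in node_dict, retriever_dict, or query_engine_dict; flat documents are returned without any recursion. Debug by printing each node's type and, for IndexNodes, its index_id before querying.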
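A quick pre-flight check can catch broken links before you query. The sketch below assumes you can export your links as a plain mapping from child id to target id; `find_dangling_refs` and both dictionaries are hypothetical names, not part of LlamaIndex.

```python
from typing import Dict, List


def find_dangling_refs(links: Dict[str, str], targets: Dict[str, str]) -> List[str]:
    """Return ids of child chunks whose link target is missing from `targets`."""
    return sorted(child for child, target in links.items() if target not in targets)


# child chunk id -> id of the parent it should resolve to
links = {
    "gd-detail": "sec-supervised",
    "km-detail": "sec-unsupervised",
    "orphan": "sec-missing",
}
# parent id -> parent text
targets = {
    "sec-supervised": "Section 1: Supervised Learning ...",
    "sec-unsupervised": "Section 2: Unsupervised Learning ...",
}

print(find_dangling_refs(links, targets))  # ['orphan']
```

An empty list means every link has somewhere to go; anything else names the chunks to fix before indexing.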

Experienced dev note

Recursive retrieval is often slower than flat retrieval because every followed link triggers an additional retriever or query-engine call, and a query engine usually means an extra LLM call. In production, measure end-to-end latency and compare with simpler approaches (e.g., always including the parent summary in the initial chunk metadata). Sometimes pre-flattening your hierarchy into richer initial chunks beats recursive fetching. Also, the relationship graph must be built *before* indexing: you can't retrofit it afterward without reindexing.
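As a baseline to compare against, pre-flattening can be as simple as prepending each parent summary to its chunks before indexing. This is a minimal sketch under assumed data shapes: the `parent_id` and `text` keys, the `[Context: ...]` prefix, and the function name are all illustrative choices.

```python
from typing import Dict, List


def flatten_with_parent_summary(
    chunks: List[Dict[str, str]], summaries: Dict[str, str]
) -> List[str]:
    """Prefix each chunk with its parent's summary so it stands on its own."""
    enriched = []
    for chunk in chunks:
        summary = summaries.get(chunk["parent_id"], "")
        prefix = f"[Context: {summary}]\n" if summary else ""
        enriched.append(prefix + chunk["text"])
    return enriched


summaries = {"sec1": "Supervised learning trains models on labeled data."}
chunks = [
    {"parent_id": "sec1", "text": "Gradient descent adjusts weights."},
    {"parent_id": "unknown", "text": "K-means assigns points to centroids."},
]

for text in flatten_with_parent_summary(chunks, summaries):
    print(text)
    print("---")
```

Each enriched chunk costs more tokens at index time but needs no extra hop at query time, which is the trade-off to measure.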

Check your understanding

You have a 3-level document hierarchy: book → chapters → sections. Your retrieval setup follows child links exactly one level down from each matched node. A query matches a chapter-level node. Will the retriever also fetch the section-level nodes (grandchildren of the book)? Why or why not?

Show answer hint

Yes. Depth is measured from the matched node, not from the root. Sections are grandchildren of the book but direct children of the matched chapter, so following one level down from a chapter fetches its sections. If the match were at the book level, one level down would reach only the chapters; if it were at the section level, there would be no children to fetch.

VERSION Before llama-index 0.10.0, RecursiveRetriever was imported from llama_index.retrievers. Since 0.10.0, and still as of version 0.12.x (April 2026), core classes live under the llama_index.core namespace. Update imports accordingly: from llama_index.core.retrievers import RecursiveRetriever (not from llama_index.retrievers).

Next, explore HierarchicalNodeParser to automatically build parent-child relationships during document parsing, which enables recursive retrieval without manual metadata setup.
