Code Advanced medium · 6 min

QueryEngineTool: tools from any query engine

What you will learn

Wrap any query engine as a callable tool so agents can invoke it with natural language inputs and receive structured results.

Why this matters

Building agent systems requires converting your domain-specific query engines into tools that agents can discover, invoke, and chain. QueryEngineTool is the bridge between your RAG pipelines and multi-step reasoning workflows.

Skip if: you're building a simple single-query application, or your query engine gains nothing from being part of a multi-tool agent loop. Also skip it if the query engine's interface doesn't map cleanly to a single natural-language input.

Explanation

What it is: QueryEngineTool wraps a LlamaIndex query engine and exposes it as a callable tool with a defined schema. An agent can then invoke this tool as part of its reasoning loop, passing natural language queries and receiving responses.

How it works mechanically: When you instantiate QueryEngineTool, you pass a query_engine instance and a ToolMetadata object that carries the name, the description, and optionally a return_direct flag. The QueryEngineTool.from_defaults helper builds that metadata for you from keyword arguments. The agent's tool-calling mechanism reads the metadata to decide whether to invoke the tool and passes a query string as input; the tool then executes query_engine.query(input) internally. The response comes back to the agent wrapped in a ToolOutput, for further reasoning or final output.
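The delegation described above can be modeled in a few lines of plain Python. This is a simplified sketch, not LlamaIndex's implementation: ToyToolMetadata, ToyQueryEngine, and ToyQueryEngineTool are hypothetical stand-ins that only mirror the shape of the real classes, and the keyword lookup replaces real retrieval.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ToyToolMetadata:
    """Hypothetical stand-in for the metadata an agent reads."""
    name: str
    description: str
    return_direct: bool = False


class ToyQueryEngine:
    """Hypothetical query engine: answers by simple keyword lookup."""

    def __init__(self, facts: Dict[str, str]) -> None:
        self._facts = facts

    def query(self, query_str: str) -> str:
        lowered = query_str.lower()
        for keyword, answer in self._facts.items():
            if keyword in lowered:
                return answer
        return "No answer found."


class ToyQueryEngineTool:
    """Wraps a query engine and forwards the input string to query()."""

    def __init__(self, query_engine: ToyQueryEngine, metadata: ToyToolMetadata) -> None:
        self.query_engine = query_engine
        self.metadata = metadata

    def __call__(self, query_str: str) -> str:
        # The tool adds no logic of its own: it delegates to the engine.
        return self.query_engine.query(query_str)


engine = ToyQueryEngine({"red spot": "The Great Red Spot is a storm on Jupiter."})
toy_tool = ToyQueryEngineTool(
    engine,
    ToyToolMetadata(
        name="astronomy_qa",
        description="Answers questions about the solar system.",
    ),
)
print(toy_tool.metadata.name)
print(toy_tool("What is the Great Red Spot?"))
```

The point of the sketch is the split of responsibilities: the metadata is what the agent reads, and the call path is a thin pass-through to query().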

When to use it: Use this when you have one or more specialized query engines (RAG over documents, SQL database, API, etc.) and want an agent to decide which tool to use based on the task. It's essential for multi-tool agent architectures where the agent coordinates between different knowledge sources.

Analogy

Think of QueryEngineTool like wrapping a subject-matter expert as a service worker in an office. The agent is the manager asking questions; the tool is the wrapped expert who only answers questions in their domain. The metadata is the expert's job title and brief biography: it tells the manager when to call this person.

Code

Illustrative only - not runnable without a valid API key

python

import os

from llama_index.core import Document, Settings, VectorStoreIndex
from llama_index.core.tools import QueryEngineTool
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

# Placeholder key: replace it, or export OPENAI_API_KEY in your shell instead.
os.environ["OPENAI_API_KEY"] = "sk-your-key-here"

# Global defaults used for answer synthesis and for embedding the documents.
Settings.llm = OpenAI(model="gpt-4.1")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

docs = [
    Document(text="The solar system consists of the Sun and eight planets. Mercury is the closest to the Sun."),
    Document(text="Jupiter is the largest planet and has a Great Red Spot. Saturn is known for its rings."),
    Document(text="Earth orbits the Sun every 365.25 days. The Moon orbits Earth every 27.3 days."),
]

# Build an in-memory vector index and expose it as a query engine.
index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine()

# from_defaults builds the ToolMetadata (name, description, return_direct) for us.
tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="astronomy_qa",
    description="Answers questions about planets, moons, and the solar system based on an astronomy knowledge base.",
    return_direct=True,
)

print(f"Tool name: {tool.metadata.name}")
print(f"Tool description: {tool.metadata.description}")
print(f"Tool function schema: {tool.metadata.fn_schema}")
print()

result = tool("What is the Great Red Spot?")
print(f"Result type: {type(result)}")
print(f"Result: {result}")

Output

Tool name: astronomy_qa
Tool description: Answers questions about planets, moons, and the solar system based on an astronomy knowledge base.
Tool function schema: <class 'llama_index.core.tools.types.DefaultToolSchema'>

Result type: <class 'llama_index.core.tools.types.ToolOutput'>
Result: Jupiter is the largest planet and has a Great Red Spot.

What just happened?

We built a VectorStoreIndex from three astronomy documents, exposed it as a query engine, then wrapped that query engine in a QueryEngineTool with metadata describing its purpose. When we called the tool with a question string, it invoked the query engine internally and returned a ToolOutput wrapping the engine's response. The exact answer text comes from the LLM, so it may differ between runs. The metadata is now available for an agent to read when deciding whether to use this tool.

Common gotcha

The return_direct parameter controls whether the tool's response bypasses the agent's reasoning loop. With return_direct=True, the agent returns the tool result to the user immediately, without further processing. With return_direct=False, the agent sees the result and may reason further or call other tools. This matters more than it appears: the wrong setting causes either premature termination of the agent loop or unnecessary extra LLM calls.
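The effect of the flag is easiest to see in a toy loop. The following is a minimal sketch under stated assumptions: run_toy_agent, docs_tool, and tickets_tool are hypothetical, the "plan" is a fixed list rather than LLM decisions, and the final join stands in for the LLM's synthesis step. It is not how LlamaIndex agents are implemented.

```python
from typing import Callable, List, Tuple

# Each toy tool is (name, function, return_direct); all names are hypothetical.
ToyTool = Tuple[str, Callable[[str], str], bool]


def run_toy_agent(question: str, planned_tools: List[ToyTool]) -> str:
    """Call tools in order; stop early when a tool is marked return_direct."""
    observations: List[str] = []
    for name, fn, return_direct in planned_tools:
        result = fn(question)
        if return_direct:
            # The raw tool result becomes the final answer; later tools never run.
            return result
        observations.append(f"{name}: {result}")
    # Stand-in for a final LLM synthesis step over everything observed.
    return " | ".join(observations)


def docs_tool(question: str) -> str:
    return "docs say X"


def tickets_tool(question: str) -> str:
    return "tickets say Y"


combined = run_toy_agent(
    "q", [("docs", docs_tool, False), ("tickets", tickets_tool, False)]
)
short_circuited = run_toy_agent(
    "q", [("docs", docs_tool, True), ("tickets", tickets_tool, False)]
)
print(combined)
print(short_circuited)
```

In the second run the tickets tool is never consulted, which is the "premature termination" case when the agent was supposed to combine sources.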

Error recovery

ValueError: Query engine must be a QueryEngine instance

You passed something that isn't a query engine (e.g., an index instead of a query engine). Fix: call index.as_query_engine() before passing it to QueryEngineTool.

TypeError: metadata must be a ToolMetadata object or dict

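The metadata argument has the wrong structure. Fix: pass a ToolMetadata object with at least name and description set, or let QueryEngineTool.from_defaults build it from name and description keyword arguments. A plain dict does not support the attribute access (tool.metadata.name) that agents rely on.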

AttributeError: 'QueryEngineTool' object has no attribute 'metadata'

You're trying to access metadata before the tool is fully initialized. Fix: ensure QueryEngineTool is instantiated and assigned to a variable before accessing tool.metadata.

Experienced dev note

The distinction between return_direct=True and return_direct=False is critical in multi-tool agent scenarios. A common mistake is to set it to True and then wonder why the agent stops reasoning after one tool call. Use return_direct=False (the default) when the agent should coordinate between multiple tools; use return_direct=True only when this tool's answer is the final answer and no further reasoning is needed. Also: if your query engine is expensive to call, instrument it with logging *before* wrapping it in QueryEngineTool. Debugging tool invocations is harder afterwards because the agent, not your code, controls the call.
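One way to instrument an engine before wrapping it is a thin logging wrapper around query(). This is a sketch only: LoggingQueryEngine and EchoEngine are hypothetical, and the wrapper is duck-typed, so whether your installed LlamaIndex version accepts it in place of a real query engine is an assumption to verify. The library's own callback and instrumentation hooks may be a better fit in production.

```python
import logging
import time

logger = logging.getLogger("query_engine_calls")


class LoggingQueryEngine:
    """Hypothetical duck-typed wrapper that logs every query() call."""

    def __init__(self, inner, label: str) -> None:
        self._inner = inner
        self._label = label
        self.call_count = 0

    def query(self, query_str):
        self.call_count += 1
        start = time.perf_counter()
        try:
            return self._inner.query(query_str)
        finally:
            # Runs on success and on failure, so failed calls are logged too.
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info(
                "%s call #%d took %.1f ms: %r",
                self._label, self.call_count, elapsed_ms, query_str,
            )


class EchoEngine:
    """Stand-in engine so the sketch runs without any external service."""

    def query(self, query_str):
        return f"echo: {query_str}"


logging.basicConfig(level=logging.INFO)
logged_engine = LoggingQueryEngine(EchoEngine(), label="astronomy_qa")
answer = logged_engine.query("What is the Great Red Spot?")
print(answer)
```

Because the wrapper counts and times calls itself, you can see how often the agent invokes the engine even when the agent's own trace is hard to read.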

Check your understanding

You have two query engines: one for customer support tickets (indexed by ticket ID and subject) and one for product documentation (indexed by feature). You want to build an agent that can route incoming questions to the right engine. Should you create one QueryEngineTool wrapping both engines, or two separate QueryEngineTool instances? Why?

Answer hint

A correct answer recognizes that agents select tools by reading each tool's metadata. Two separate tools with distinct names and descriptions let the agent's LLM compare those descriptions against the question and route dynamically. One tool wrapping both engines forces you to handle routing logic inside the query engine itself, which defeats the purpose of having an agent. The key insight: QueryEngineTool is about *letting the agent choose*.
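To make the routing argument concrete, here is a toy sketch in which word overlap between the question and each description stands in for the LLM's choice. The tool names, descriptions, stopword list, and pick_tool function are all hypothetical; a real agent makes this decision with a model, not with string matching.

```python
import re
from typing import Dict, Set

# Hypothetical names and descriptions for the two engines in the question.
TOOL_DESCRIPTIONS: Dict[str, str] = {
    "support_tickets": "Looks up customer support tickets by ticket ID and subject.",
    "product_docs": "Answers how-to questions about a product feature using the documentation.",
}

# Small stopword list so filler words do not drive the match.
STOPWORDS = {"a", "about", "and", "by", "do", "how", "i", "is", "of", "the", "to", "up", "what"}


def content_words(text: str) -> Set[str]:
    """Lowercase alphabetic words with stopwords removed."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def pick_tool(question: str, descriptions: Dict[str, str]) -> str:
    """Toy stand-in for the LLM: pick the description sharing the most words."""
    question_words = content_words(question)
    return max(
        descriptions,
        key=lambda name: len(question_words & content_words(descriptions[name])),
    )


ticket_choice = pick_tool("What is the status of ticket 4521?", TOOL_DESCRIPTIONS)
docs_choice = pick_tool("How do I enable the export feature?", TOOL_DESCRIPTIONS)
print(ticket_choice)
print(docs_choice)
```

With a single merged tool there would be only one description to match against, so nothing in the metadata could distinguish the two kinds of question.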

VERSION In llama-index < 0.10.0, QueryEngineTool was located in llama_index.tools. As of 0.10.0+, it lives in llama_index.core.tools. return_direct defaults to False in current releases; set it explicitly when upgrading so the behavior does not depend on the installed version.
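If the same code has to run against both layouts, an import fallback is one option. This sketch uses only the two module paths named in the version note; the final None fallback is an addition here so that the snippet also runs where llama-index is not installed.

```python
try:
    # Layout for 0.10.0 and later.
    from llama_index.core.tools import QueryEngineTool
except ImportError:
    try:
        # Layout before 0.10.0.
        from llama_index.tools import QueryEngineTool
    except ImportError:
        # llama-index is not installed; callers must check for None.
        QueryEngineTool = None

if QueryEngineTool is None:
    print("llama-index is not installed")
else:
    print(f"QueryEngineTool imported from {QueryEngineTool.__module__}")
```

Pinning the dependency version is usually simpler than supporting both layouts; the fallback is mainly useful for shared utilities during a migration.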

Once you've wrapped query engines as tools, the next step is understanding how OpenAI Function Calling in llama-index agents actually dispatches tool calls and handles response parsing.
