as_query_engine(): the simplest pattern
Why this matters
This is the most direct path from 'I have documents' to 'I can ask questions about them': no complex chains, no manual retrieval setup, just one line that handles the entire Q&A flow.
Explanation
What it is: as_query_engine() is a convenience method on any LlamaIndex index that converts it into a query engine: an object that accepts natural language questions and returns answers by retrieving relevant content from your documents and synthesizing a response.
How it works: When you call as_query_engine(), it takes the index you built, wraps it with a retrieval-and-synthesis pipeline, and returns an object that exposes a .query() method. When you invoke .query("your question") on it, the engine retrieves the relevant document chunks from the index and passes them to an LLM to generate an answer.
When to use it: Use this for straightforward question-answering over documents: customer support FAQs, documentation search, knowledge base queries. It's the entry point most developers should start with before optimizing.
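To make the retrieve-then-synthesize flow concrete, here is a toy model of it in plain Python. This is a sketch only, not LlamaIndex's implementation: ToyIndex, ToyQueryEngine, ToyResponse, and fake_llm are hypothetical names invented for illustration, word overlap stands in for embedding similarity, and a string-formatting function stands in for the LLM.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyResponse:
    """Hypothetical stand-in for a response: answer text plus the chunks used."""
    response: str
    source_nodes: List[str]

    def __str__(self) -> str:
        return self.response


def _words(text: str) -> set:
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


class ToyQueryEngine:
    """Retrieves the best-matching chunks, then hands them to a synthesizer."""

    def __init__(self, chunks: List[str],
                 synthesize: Callable[[str, List[str]], str], top_k: int = 2):
        self.chunks = chunks
        self.synthesize = synthesize
        self.top_k = top_k

    def query(self, question: str) -> ToyResponse:
        q = _words(question)
        # Rank chunks by shared words (a crude substitute for vector similarity).
        ranked = sorted(self.chunks, key=lambda c: len(q & _words(c)), reverse=True)
        top = ranked[: self.top_k]
        return ToyResponse(self.synthesize(question, top), top)


class ToyIndex:
    """Holds the chunks; as_query_engine() wraps them in the pipeline above."""

    def __init__(self, chunks: List[str]):
        self.chunks = list(chunks)

    def as_query_engine(self, synthesize, top_k: int = 2) -> ToyQueryEngine:
        return ToyQueryEngine(self.chunks, synthesize, top_k)


def fake_llm(question: str, context: List[str]) -> str:
    """Stand-in for the LLM call: just echoes the retrieved context."""
    return f"Answer drawn from {len(context)} chunk(s): " + " ".join(context)


index = ToyIndex([
    "Cats purr when they are content.",
    "Dogs bark at strangers.",
    "Fish swim in water.",
])
engine = index.as_query_engine(fake_llm, top_k=1)
response = engine.query("Why do dogs bark?")
print(response)
print(response.source_nodes)
```

The point of the toy is the shape: the index only stores content, the engine owns both the retrieval step and the synthesis step, and the caller sees a single .query() call.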
Analogy
Think of <code>as_query_engine()</code> as turning a library (your index) into a librarian (the query engine). You don't manage the librarian's every step: you just ask a question and get an answer back.
Code
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI
Settings.llm = OpenAI(model='gpt-4.1')
documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query('What is the main topic of these documents?')
print(f'Answer: {response}')
print(f'Type: {type(response)}')
Output
Answer: The main topic of these documents is artificial intelligence and machine learning fundamentals.
Type: <class 'llama_index.core.base.response.schema.Response'>
What just happened?
The code created a vector store index from documents in the 'data' folder, then called <code>as_query_engine()</code> to convert it into a query engine. When <code>.query()</code> was invoked with a question, the engine retrieved the top-k most similar document chunks (two by default), passed them to GPT-4.1, and returned a <code>Response</code> object whose <code>response</code> attribute holds the synthesized answer as a string.
Common gotcha
Developers assume response is a plain string and try response.split() or len(response), but response is a Response object, so those calls fail (AttributeError and TypeError respectively). Get the actual string with str(response) or response.response.
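The gotcha can be reproduced without calling an LLM. The sketch below uses StubResponse, a hypothetical stand-in that mirrors only the members mentioned above (a response attribute and a string conversion); it is not the library's class, and answer_text is an illustrative helper name.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StubResponse:
    """Hypothetical stand-in: holds the answer text, but is not itself a str."""
    response: Optional[str]
    source_nodes: List[object] = field(default_factory=list)

    def __str__(self) -> str:
        return self.response or "None"


def answer_text(response: object) -> str:
    """Convert whatever the query engine returned into a plain string."""
    return str(response)


resp = StubResponse(response="Paris is the capital of France.")

try:
    len(resp)  # the mistake: the wrapper object has no length
except TypeError as exc:
    print(f"len(resp) failed: {exc}")

text = answer_text(resp)      # the fix: convert first, then use str methods
first_word = text.split()[0]
print(first_word)
```

Converting once at the boundary, as answer_text does, keeps the rest of your code working with ordinary strings.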
Error recovery
FileNotFoundError
ValueError: Model must be specified
AuthenticationError
Experienced dev note
The Response object contains metadata beyond just the answer text: response.source_nodes gives you the actual document chunks that were used to generate the answer, letting you cite sources or debug retrieval quality. Inspect .source_nodes during development: it reveals cases where retrieval quietly returned irrelevant chunks and the LLM answered from its own knowledge instead.
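A small helper makes that inspection routine. The sketch below assumes each entry in source_nodes exposes a score attribute and a get_content() method; StubSourceNode is a hypothetical stand-in used so the example runs without an index, and summarize_sources is an illustrative name, not a library function.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class StubSourceNode:
    """Hypothetical stand-in for one retrieved chunk and its similarity score."""
    text: str
    score: Optional[float]

    def get_content(self) -> str:
        return self.text


def summarize_sources(source_nodes: Iterable, preview_chars: int = 60) -> List[str]:
    """Return one line per retrieved chunk: rank, score, and a short preview."""
    lines = []
    for rank, node in enumerate(source_nodes, start=1):
        score = "n/a" if node.score is None else f"{node.score:.2f}"
        # Collapse newlines and runs of spaces so each preview stays on one line.
        preview = " ".join(node.get_content().split())[:preview_chars]
        lines.append(f"{rank}. score={score} | {preview}")
    return lines


nodes = [
    StubSourceNode("Neural networks\nlearn from data.", 0.8123),
    StubSourceNode("Unrelated footer", None),
]
summary = summarize_sources(nodes)
for line in summary:
    print(line)
```

In a real session you would pass response.source_nodes instead of the stub list, after confirming that the objects in your installed version expose these members.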
Check your understanding
If you called as_query_engine() on an index but the answers seem generic or don't reference your documents, what could be wrong and how would you verify it?
Answer hint
The answer should involve checking <code>response.source_nodes</code> to confirm the retriever is actually pulling relevant documents. If <code>source_nodes</code> is empty or contains irrelevant chunks, the problem is retrieval quality (embedding model mismatch, too few documents, or bad similarity thresholds), not the query engine itself.
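That check can be turned into a quick triage function. The version below is a sketch under stated assumptions: each source node carries a score attribute where higher means more similar, and the 0.7 cutoff is an arbitrary illustrative value, not a recommended setting. diagnose_retrieval is a hypothetical helper name.

```python
from types import SimpleNamespace
from typing import Sequence


def diagnose_retrieval(source_nodes: Sequence, min_score: float = 0.7) -> str:
    """Classify a retrieval result as 'empty', 'weak', or 'ok'.

    min_score is an arbitrary example cutoff; tune it for your embedding model.
    """
    if not source_nodes:
        # Nothing retrieved: the answer cannot be grounded in your documents.
        return "empty"
    scores = [n.score for n in source_nodes if n.score is not None]
    if scores and max(scores) < min_score:
        # Chunks came back, but even the best match is a poor one.
        return "weak"
    return "ok"


# Stand-in nodes (SimpleNamespace) so the sketch runs without an index.
no_hits = diagnose_retrieval([])
poor_hits = diagnose_retrieval([SimpleNamespace(score=0.31), SimpleNamespace(score=0.28)])
good_hits = diagnose_retrieval([SimpleNamespace(score=0.84), SimpleNamespace(score=0.40)])
print(no_hits, poor_hits, good_hits)
```

An 'ok' result only says the scores look reasonable. Reading the retrieved text is still the reliable way to confirm the chunks are relevant.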
GPTVectorStoreIndex. As of 0.10.0+, all index types inherit as_query_engine(), and the old class names are deprecated. Use VectorStoreIndex and Settings instead.