How-to · beginner · 3 min read

How to use LlamaIndex with FastAPI

Quick answer
Use LlamaIndex to build an index from your documents and expose query endpoints via FastAPI. Load documents, create an index with LlamaIndex, then serve queries asynchronously through FastAPI routes.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install llama-index fastapi uvicorn openai

Setup

Install the required packages and set your OpenAI API key as an environment variable.

bash
pip install llama-index fastapi uvicorn openai
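Then export the key in the same shell before starting the server (the key value below is a placeholder, not a real key):

```shell
# Placeholder key -- replace with your real OpenAI API key
export OPENAI_API_KEY="sk-your-key-here"
```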

Step by step

This example shows how to create a simple FastAPI app that uses LlamaIndex to index documents and answer queries.

python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

app = FastAPI()

# Load documents and build the index once at startup
documents = SimpleDirectoryReader('docs').load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

class QueryRequest(BaseModel):
    query: str

@app.post('/query')
async def query_index(request: QueryRequest):
    try:
        response = query_engine.query(request.query)
        return {'response': str(response)}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

# To run: uvicorn main:app --reload
output
INFO:     Started server process [12345]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
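With the server running, a client POSTs a JSON body matching the QueryRequest model to /query and reads back a JSON object with a "response" field. The sketch below exercises exactly that request/response contract using only the standard library; the stub handler is our stand-in for the real app, so it runs without LlamaIndex or an API key:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

class StubQueryHandler(BaseHTTPRequestHandler):
    """Stand-in for the FastAPI /query endpoint: echoes the query back."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        body = json.loads(self.rfile.read(length))
        reply = json.dumps({"response": f"stub answer to: {body['query']}"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):  # silence per-request logging
        pass

# Bind to an ephemeral port and serve in the background
server = HTTPServer(("127.0.0.1", 0), StubQueryHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Same request shape a real client would send to the FastAPI app
req = Request(
    f"http://127.0.0.1:{port}/query",
    data=json.dumps({"query": "What is in the docs?"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urlopen(req) as resp:
    result = json.loads(resp.read())
server.shutdown()
```

Against the real server, the equivalent call is a POST to http://127.0.0.1:8000/query with the same JSON body.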

Common variations

  • Swap in other index types from llama_index.core, such as SummaryIndex or KeywordTableIndex, for different retrieval behavior.
  • Load documents from other sources like PDFs or web pages.
  • Use the async query path (await query_engine.aquery(...)) so queries don't block the event loop.
  • Switch OpenAI models by configuring the LLM, for example setting Settings.llm from llama_index.core to an OpenAI instance with a different model name.

Troubleshooting

  • If you get an authentication error, ensure OPENAI_API_KEY is set in the environment the server runs in.
  • For ModuleNotFoundError, verify all packages are installed in the active virtual environment.
  • If queries return empty or irrelevant answers, confirm the 'docs' path is correct and that documents actually loaded (check len(documents) at startup).
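A small guard at the top of main.py turns a cryptic mid-request authentication failure into a clear error at startup. This helper is our own addition (not part of LlamaIndex or FastAPI), and the message text is ours:

```python
import os

def require_api_key() -> str:
    """Fail fast with a clear message if the OpenAI key is missing."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Export it before starting the server."
        )
    return key
```

Call require_api_key() before building the index so a missing key stops the app immediately.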

Key Takeaways

  • Use LlamaIndex to build document indexes easily for FastAPI apps.
  • Load documents once at startup to optimize query performance.
  • Expose a POST endpoint in FastAPI to handle user queries asynchronously.
  • Set your OPENAI_API_KEY in environment variables to authenticate API calls.
  • Customize indexing and document loading based on your data source and use case.
Verified 2026-04 · gpt-4o, VectorStoreIndex