How-to · beginner · 3 min read

How to use LlamaIndex with FastAPI

Quick answer
Use LlamaIndex to build an index from your documents and expose query endpoints via FastAPI. Load documents, create an index with LlamaIndex, then serve queries asynchronously through FastAPI routes.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install llama-index fastapi uvicorn openai

Setup

Install the required packages and set your OpenAI API key as an environment variable.

bash
pip install llama-index fastapi uvicorn openai
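Then export the key in the same shell before starting the server (the key value below is a placeholder, not a real key):

```shell
# Placeholder key -- replace with your real OpenAI API key
export OPENAI_API_KEY="sk-your-key-here"
```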

Step by step

This example shows how to create a simple FastAPI app that uses LlamaIndex to index documents and answer queries.

python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

app = FastAPI()

# Load documents and build the index once at startup
documents = SimpleDirectoryReader('docs').load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

class QueryRequest(BaseModel):
    query: str

@app.post('/query')
async def query_index(request: QueryRequest):
    try:
        response = query_engine.query(request.query)
        return {'response': str(response)}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

# To run: uvicorn main:app --reload
output
INFO:     Started server process [12345]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
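With the server running, a client POSTs a JSON body matching the QueryRequest model to /query and reads back a JSON object with a "response" field. The sketch below exercises exactly that request/response contract using only the standard library; the stub handler is our stand-in for the real app, so it runs without LlamaIndex or an API key:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

class StubQueryHandler(BaseHTTPRequestHandler):
    """Stand-in for the FastAPI /query endpoint: echoes the query back."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        body = json.loads(self.rfile.read(length))
        reply = json.dumps({"response": f"stub answer to: {body['query']}"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):  # silence per-request logging
        pass

# Bind to an ephemeral port and serve in the background
server = HTTPServer(("127.0.0.1", 0), StubQueryHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Same request shape a real client would send to the FastAPI app
req = Request(
    f"http://127.0.0.1:{port}/query",
    data=json.dumps({"query": "What is in the docs?"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urlopen(req) as resp:
    result = json.loads(resp.read())
server.shutdown()
```

Against the real server, the equivalent call is a POST to http://127.0.0.1:8000/query with the same JSON body.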

Common variations

  • Swap in other index types from llama_index.core, such as SummaryIndex or KeywordTableIndex, for different retrieval behavior.
  • Load documents from other sources like PDFs or web pages.
  • Use the async query path (await query_engine.aquery(...)) so queries don't block the event loop.
  • Switch OpenAI models by configuring the LLM, for example setting Settings.llm from llama_index.core to an OpenAI instance with a different model name.

Troubleshooting

  • If you get an authentication error, ensure OPENAI_API_KEY is set in the environment the server runs in.
  • For ModuleNotFoundError, verify all packages are installed in the active virtual environment.
  • If queries return empty or irrelevant answers, confirm the 'docs' path is correct and that documents actually loaded (check len(documents) at startup).
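A small guard at the top of main.py turns a cryptic mid-request authentication failure into a clear error at startup. This helper is our own addition (not part of LlamaIndex or FastAPI), and the message text is ours:

```python
import os

def require_api_key() -> str:
    """Fail fast with a clear message if the OpenAI key is missing."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Export it before starting the server."
        )
    return key
```

Call require_api_key() before building the index so a missing key stops the app immediately.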

Key Takeaways

  • Use LlamaIndex to build document indexes easily for FastAPI apps.
  • Load documents once at startup to optimize query performance.
  • Expose a POST endpoint in FastAPI to handle user queries asynchronously.
  • Set your OPENAI_API_KEY in environment variables to authenticate API calls.
  • Customize indexing and document loading based on your data source and use case.
Verified 2026-04 · gpt-4o, VectorStoreIndex