How to use Azure OpenAI with LlamaIndex
Quick answer
Use the AzureOpenAI LLM class that LlamaIndex provides for Azure (in the llama-index-llms-azure-openai integration package), configured with your Azure endpoint, API key, and deployment name, then set it as LlamaIndex's LLM. With that in place you can build and query indexes against your Azure OpenAI deployments. Note that the AzureOpenAI client in the raw openai package is a lower-level HTTP client and cannot be passed to LlamaIndex directly as the llm.
Prerequisites
- Python 3.8+
- An Azure OpenAI resource with a chat model deployment (and an embedding deployment if you build vector indexes)
- pip install llama-index llama-index-llms-azure-openai llama-index-embeddings-azure-openai
- Environment variables AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT set
Setup
Install the required Python packages and set environment variables for your Azure OpenAI API key and endpoint.
- Install packages: llama-index plus the llama-index-llms-azure-openai and llama-index-embeddings-azure-openai integrations
- Set AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT in your environment
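On macOS/Linux you can export these variables in your shell before running the examples (the values below are placeholders, not real credentials):

```shell
# Placeholder values: substitute your real key and your resource's endpoint URL
export AZURE_OPENAI_API_KEY="your-api-key"
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
```

On Windows, use setx or set the variables in the system environment settings instead.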
pip install "llama-index>=0.10" llama-index-llms-azure-openai llama-index-embeddings-azure-openai
Step by step
This example configures LlamaIndex to use Azure OpenAI for both the LLM and embeddings, builds a vector index from a directory of documents, and queries it. The deployment and model names below are placeholders; use the names of your own Azure deployments.
import os
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
from llama_index.llms.azure_openai import AzureOpenAI
# Configure the Azure OpenAI LLM; engine is your chat deployment name
Settings.llm = AzureOpenAI(
    engine="my-chat-deployment",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version="2024-02-01",
)
# Embeddings must also come from an Azure deployment
Settings.embed_model = AzureOpenAIEmbedding(
    deployment_name="my-embedding-deployment",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version="2024-02-01",
)
# Load documents from a directory and build the index
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
# Query the index through a query engine
query_engine = index.as_query_engine()
response = query_engine.query("What is the main topic of the documents?")
print("Answer:", response.response)
Output
Answer: The main topic of the documents is ...
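For intuition about what the vector index is doing under the hood, here is a self-contained toy sketch (not LlamaIndex code): each document is embedded once at index time, and a query retrieves the document whose vector has the highest cosine similarity. The letter-frequency "embedding" is a crude stand-in for the Azure embedding model and matches on shared letters, not meaning.

```python
import math

def toy_embed(text):
    # Stand-in for the Azure embedding model: a 26-dim letter-frequency vector.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# "Indexing": embed every document once and store (text, vector) pairs.
docs = [
    "the cat sat on the mat",
    "stock prices rose sharply",
    "cats are small felines",
]
index = [(doc, toy_embed(doc)) for doc in docs]

# "Querying": embed the query and return the most similar document.
query_vec = toy_embed("tell me about cats")
best = max(index, key=lambda pair: cosine(query_vec, pair[1]))
print(best[0])
```

A real vector store does the same ranking over high-dimensional semantic embeddings and passes the top matches to the LLM as context.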
Common variations
- Use a different Azure OpenAI deployment by changing the engine (deployment name) passed to the AzureOpenAI constructor.
- Use async calls with asyncio and LlamaIndex's async query methods (aquery).
- Switch to other LlamaIndex index types such as SummaryIndex or TreeIndex (the successors of GPTListIndex and GPTTreeIndex) depending on your use case.
import os
import asyncio
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
from llama_index.llms.azure_openai import AzureOpenAI
async def async_query():
    Settings.llm = AzureOpenAI(
        engine="my-azure-deployment",  # your chat deployment name
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_version="2024-02-01",
    )
    Settings.embed_model = AzureOpenAIEmbedding(
        deployment_name="my-embedding-deployment",
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_version="2024-02-01",
    )
    documents = SimpleDirectoryReader("./data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    query_engine = index.as_query_engine()
    response = await query_engine.aquery("Summarize the documents.")
    print("Async answer:", response.response)
asyncio.run(async_query())
Output
Async answer: The documents summarize ...
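One practical payoff of the async path is fanning out several queries concurrently with asyncio.gather. The sketch below stubs the query engine with a plain coroutine (a stand-in for aquery) so the concurrency pattern is visible without any network calls:

```python
import asyncio

async def fake_aquery(question):
    # Stand-in for query_engine.aquery(): simulate a 0.1 s model call.
    await asyncio.sleep(0.1)
    return "answer to: " + question

async def run_all(questions):
    # gather() starts every coroutine before awaiting any of them,
    # so the simulated calls overlap instead of running back to back.
    return await asyncio.gather(*(fake_aquery(q) for q in questions))

questions = [
    "What is the main topic?",
    "Who are the key people mentioned?",
    "Summarize the conclusions.",
]
answers = asyncio.run(run_all(questions))
for answer in answers:
    print(answer)
```

With real aquery calls the same pattern turns three sequential round trips into roughly the latency of the slowest one.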
Troubleshooting
- If you get authentication errors, verify that the AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT environment variables are correct.
- Ensure the deployment name passed as engine matches an actual deployment on your Azure OpenAI resource.
- If you see "model not found" errors, confirm the deployment exists on the resource behind your endpoint and that the API version you request supports it.
- If index construction fails with embedding errors, make sure Settings.embed_model points at an Azure embedding deployment; the default embedding model targets the non-Azure OpenAI API.
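Many of these errors can be caught before any API call is made. A small hypothetical preflight helper (not part of LlamaIndex) that fails fast with a readable message when configuration is missing:

```python
import os

REQUIRED_VARS = ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT")

def check_azure_config(env=os.environ):
    # Collect every missing variable so the user sees them all at once.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing Azure OpenAI configuration: " + ", ".join(missing)
        )
    endpoint = env["AZURE_OPENAI_ENDPOINT"]
    if not endpoint.startswith("https://"):
        raise RuntimeError(
            "AZURE_OPENAI_ENDPOINT should be an https URL, got " + repr(endpoint)
        )

# Example with placeholder values; in practice call check_azure_config() with no
# arguments to validate the real environment before building an index.
check_azure_config({
    "AZURE_OPENAI_API_KEY": "k",
    "AZURE_OPENAI_ENDPOINT": "https://your-resource.openai.azure.com/",
})
print("config ok")
```

Running this at startup turns a confusing mid-query authentication failure into an immediate, explicit error.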
Key Takeaways
- Use the AzureOpenAI LLM class from LlamaIndex's Azure integration, configured with your Azure endpoint and deployment name.
- Set the Azure LLM (and an Azure embedding model) on LlamaIndex's Settings before building indexes.
- Keep credentials in environment variables for secure and flexible configuration.
- LlamaIndex supports both synchronous and asynchronous querying with Azure OpenAI.
- Verify deployment names and API versions to avoid the most common errors.