How-to · Beginner · 3 min read

How to call Azure OpenAI with LiteLLM

Quick answer
Use the litellm Python package: store your Azure OpenAI endpoint and API key in environment variables, then call litellm.completion() with the model string prefixed azure/ (for example, azure/<your-deployment-name>). LiteLLM routes the request to your Azure-hosted deployment and returns an OpenAI-style response.

PREREQUISITES

  • Python 3.8+
  • Azure OpenAI resource with endpoint and API key
  • pip install litellm
  • Set environment variables AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT

Setup

Install the litellm package and set your Azure OpenAI credentials as environment variables.

  • Install LiteLLM: pip install litellm
  • Set environment variables:
    • AZURE_OPENAI_API_KEY - your Azure OpenAI API key
    • AZURE_OPENAI_ENDPOINT - your Azure OpenAI endpoint URL (e.g., https://your-resource.openai.azure.com/)
bash
pip install litellm
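On macOS or Linux you can set both variables for the current shell session as shown below; the key and endpoint here are placeholders, so substitute your own values:

```shell
# Placeholder credentials -- replace with your real key and endpoint
export AZURE_OPENAI_API_KEY="your-api-key"
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"

# Confirm the variables are visible to child processes
echo "Endpoint set to: $AZURE_OPENAI_ENDPOINT"
```

On Windows PowerShell, the equivalent is `$env:AZURE_OPENAI_API_KEY = "your-api-key"`.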

Step by step

Use the following Python code to call an Azure OpenAI model with LiteLLM. It reads your credentials from the environment, sends a chat completion request with litellm.completion(), and prints the model's reply (the exact text will vary).

python
import os
from litellm import completion

# Ensure environment variables are set
api_key = os.environ["AZURE_OPENAI_API_KEY"]
endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]

# On Azure, the model string is "azure/" plus your *deployment* name,
# which may differ from the underlying model name.
response = completion(
    model="azure/gpt-4o",
    api_key=api_key,
    api_base=endpoint,
    api_version="2024-02-01",  # use an API version your resource supports
    messages=[{"role": "user", "content": "Hello from LiteLLM on Azure OpenAI!"}],
)

print("Response:", response.choices[0].message.content)
output
Response: Hello! How can I help you today?

Common variations

You can customize your calls by:

  • Targeting different deployments, e.g. azure/gpt-4o-mini or azure/gpt-4o.
  • Sending multiple messages for multi-turn conversations.
  • Adjusting parameters like max_tokens or temperature.
  • Using async calls with await litellm.acompletion(...) in async code.
python
import asyncio
import os

from litellm import acompletion

async def async_call():
    response = await acompletion(
        model="azure/gpt-4o-mini",  # your Azure deployment name
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_base=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_version="2024-02-01",
        messages=[{"role": "user", "content": "Async call with LiteLLM on Azure OpenAI."}],
    )
    print("Async response:", response.choices[0].message.content)

asyncio.run(async_call())
output
Async response: Hi! How can I help you today?

Troubleshooting

  • If you get authentication errors, verify your AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT environment variables are correct.
  • Ensure your Azure OpenAI resource is provisioned and that the deployment name in your model string (azure/<deployment-name>) matches a deployment that exists in your resource.
  • For network errors, check your firewall or proxy settings.

Key Takeaways

  • Use litellm.completion() (or acompletion() for async) with model="azure/<deployment-name>" to call Azure OpenAI models.
  • Set AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT environment variables before running your code.
  • LiteLLM supports both sync and async calls for flexible integration.
  • Verify model names and Azure resource setup to avoid authentication or usage errors.
Verified 2026-04 · gpt-4o, gpt-4o-mini