How-to · Beginner · 3 min read

How to call Azure OpenAI with LiteLLM

Quick answer
Use the litellm Python package: store your Azure OpenAI endpoint and API key in environment variables, then call litellm.completion() with the model string prefixed azure/ (for example, azure/<your-deployment-name>). LiteLLM routes the request to your Azure-hosted deployment and returns an OpenAI-style response.

PREREQUISITES

  • Python 3.8+
  • Azure OpenAI resource with endpoint and API key
  • pip install litellm
  • Set environment variables AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT

Setup

Install the litellm package and set your Azure OpenAI credentials as environment variables.

  • Install LiteLLM: pip install litellm
  • Set environment variables:
    • AZURE_OPENAI_API_KEY - your Azure OpenAI API key
    • AZURE_OPENAI_ENDPOINT - your Azure OpenAI endpoint URL (e.g., https://your-resource.openai.azure.com/)
bash
pip install litellm
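On macOS or Linux you can set both variables for the current shell session as shown below; the key and endpoint here are placeholders, so substitute your own values:

```shell
# Placeholder credentials -- replace with your real key and endpoint
export AZURE_OPENAI_API_KEY="your-api-key"
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"

# Confirm the variables are visible to child processes
echo "Endpoint set to: $AZURE_OPENAI_ENDPOINT"
```

On Windows PowerShell, the equivalent is `$env:AZURE_OPENAI_API_KEY = "your-api-key"`.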

Step by step

Use the following Python code to call an Azure OpenAI model with LiteLLM. It reads your credentials from the environment, sends a chat completion request with litellm.completion(), and prints the model's reply (the exact text will vary).

python
import os
from litellm import completion

# Ensure environment variables are set
api_key = os.environ["AZURE_OPENAI_API_KEY"]
endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]

# On Azure, the model string is "azure/" plus your *deployment* name,
# which may differ from the underlying model name.
response = completion(
    model="azure/gpt-4o",
    api_key=api_key,
    api_base=endpoint,
    api_version="2024-02-01",  # use an API version your resource supports
    messages=[{"role": "user", "content": "Hello from LiteLLM on Azure OpenAI!"}],
)

print("Response:", response.choices[0].message.content)
output
Response: Hello! How can I help you today?

Common variations

You can customize your calls by:

  • Targeting different deployments, e.g. azure/gpt-4o-mini or azure/gpt-4o.
  • Sending multiple messages for multi-turn conversations.
  • Adjusting parameters like max_tokens or temperature.
  • Using async calls with await litellm.acompletion(...) in async code.
python
import asyncio
import os

from litellm import acompletion

async def async_call():
    response = await acompletion(
        model="azure/gpt-4o-mini",  # your Azure deployment name
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_base=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_version="2024-02-01",
        messages=[{"role": "user", "content": "Async call with LiteLLM on Azure OpenAI."}],
    )
    print("Async response:", response.choices[0].message.content)

asyncio.run(async_call())
output
Async response: Hi! How can I help you today?

Troubleshooting

  • If you get authentication errors, verify your AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT environment variables are correct.
  • Ensure your Azure OpenAI resource is provisioned and that the deployment name in your model string (azure/<deployment-name>) matches a deployment that exists in your resource.
  • For network errors, check your firewall or proxy settings.

Key Takeaways

  • Use litellm.completion() (or acompletion() for async) with model="azure/<deployment-name>" to call Azure OpenAI models.
  • Set AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT environment variables before running your code.
  • LiteLLM supports both sync and async calls for flexible integration.
  • Verify model names and Azure resource setup to avoid authentication or usage errors.
Verified 2026-04 · gpt-4o, gpt-4o-mini