High severity · HTTP 429 · Intermediate · Fix: 5-15 min

QuotaExceededDeploymentCapacity

azure.core.exceptions.HttpResponseError: QuotaExceededDeploymentCapacity

What this error means
The Azure OpenAI service returns this quota-exceeded error (HTTP 429) when usage reaches the deployment capacity limit for your subscription and region.

Stack trace

traceback
azure.core.exceptions.HttpResponseError: (429) QuotaExceededDeploymentCapacity: The deployment capacity for your Azure OpenAI resource has been exceeded. Please reduce usage or request a quota increase.
    at azure.ai.openai._client._client._raise_for_status(response)
    at azure.ai.openai._client._client._send_request(...)
    at azure.ai.openai.OpenAIClient.get_chat_completions(...)
    ...
QUICK FIX
Request a quota increase in the Azure Portal or reduce concurrent usage to stay within your deployment capacity limits.

Why it happens

Azure OpenAI enforces deployment capacity limits per subscription and region. When your usage exceeds those limits, the service rejects the request with this error instead of serving it. This typically happens during traffic spikes, or when a workload has grown but the quota is still at the level set during initial provisioning.

Detection

Monitor Azure OpenAI usage metrics and quota limits via the Azure Portal or Azure CLI. Set alerts on quota usage approaching capacity to catch this error before it impacts production.
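
As a minimal sketch of the alerting idea above: assuming you already obtain a usage figure and a quota limit from the Portal, the CLI, or a metrics export (fetching them is outside this sketch), a threshold check might look like the following. The function name `quota_alert_level` and the 80% and 95% thresholds are illustrative assumptions, not Azure defaults.

```python
def quota_alert_level(used: float, limit: float,
                      warn_at: float = 0.80, critical_at: float = 0.95) -> str:
    """Classify quota usage as 'ok', 'warning', or 'critical'.

    `used` and `limit` must be in the same unit. The thresholds are
    hypothetical; pick values that leave time to request an increase.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    ratio = used / limit
    if ratio >= critical_at:
        return "critical"
    if ratio >= warn_at:
        return "warning"
    return "ok"
```

Wiring the 'warning' level to a notification channel gives you lead time to file a quota request before requests start failing.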

Causes & fixes

1

Your Azure OpenAI resource deployment capacity quota is fully consumed by active requests or concurrent deployments.

✓ Fix

Reduce concurrent requests or scale down active deployments. Alternatively, request a quota increase from Azure support for your subscription and region.
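
One way to reduce concurrent requests is to cap how many calls are in flight at once on the client side. The sketch below uses a standard-library semaphore. `MAX_IN_FLIGHT`, `call_with_cap`, and `fake_request` are hypothetical names, and `fake_request` is a stand-in for your real Azure OpenAI call.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 4  # hypothetical cap; tune to what your deployment tolerates
_slots = threading.BoundedSemaphore(MAX_IN_FLIGHT)

def call_with_cap(fn, *args, **kwargs):
    """Run fn while holding one of MAX_IN_FLIGHT slots; blocks when all are busy."""
    with _slots:
        return fn(*args, **kwargs)

# Demo with a stand-in request that records peak concurrency.
_lock = threading.Lock()
_active = 0
peak = 0

def fake_request(i):
    global _active, peak
    with _lock:
        _active += 1
        peak = max(peak, _active)
    time.sleep(0.01)  # simulate network latency
    with _lock:
        _active -= 1
    return i

with ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(lambda i: call_with_cap(fake_request, i), range(20)))
```

Even with 16 worker threads, no more than `MAX_IN_FLIGHT` stand-in requests run at the same time. This only limits a single process; several processes sharing one resource would need a shared limiter.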

2

Multiple applications or services share the same Azure OpenAI resource, collectively exceeding the deployment capacity.

✓ Fix

Isolate workloads by creating separate Azure OpenAI resources per application or coordinate usage to stay within quota limits.
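
If you isolate workloads, each application needs to resolve its own endpoint and key. A minimal sketch, assuming one pair of environment variables per application: the application names and variable names below are hypothetical, chosen only for illustration.

```python
import os

# Hypothetical mapping: application name -> (endpoint variable, key variable).
RESOURCE_ENV = {
    "chatbot": ("CHATBOT_AOAI_ENDPOINT", "CHATBOT_AOAI_KEY"),
    "batch-summarizer": ("BATCH_AOAI_ENDPOINT", "BATCH_AOAI_KEY"),
}

def resource_settings(app_name, env=os.environ):
    """Return the endpoint and key for one application's dedicated resource."""
    try:
        endpoint_var, key_var = RESOURCE_ENV[app_name]
    except KeyError:
        raise ValueError(f"no Azure OpenAI resource configured for {app_name!r}") from None
    return {"endpoint": env[endpoint_var], "key": env[key_var]}
```

Failing loudly for an unmapped application prevents a new service from silently falling back to a shared resource.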

3

Your subscription is new, or its default quota limits are too low for your workload.

✓ Fix

Submit a quota increase request through the Azure Portal under the 'Help + support' > 'New support request' > 'Quota' section.

Code: broken vs fixed

Broken - triggers the error
python
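import os
from azure.ai.openai import OpenAIClient
from azure.core.credentials import AzureKeyCredential

# Azure SDK clients expect a credential object rather than a raw key string.
credential = AzureKeyCredential(os.environ['AZURE_OPENAI_KEY'])
client = OpenAIClient(os.environ['AZURE_OPENAI_ENDPOINT'], credential=credential)

# No error handling: a 429 QuotaExceededDeploymentCapacity raised here propagates uncaught.
messages = [{'role': 'user', 'content': 'Hello'}]
response = client.get_chat_completions(deployment_id='my-deployment', messages=messages)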
Fixed - works correctly
python
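import os
from azure.ai.openai import OpenAIClient
from azure.core.credentials import AzureKeyCredential
from azure.core.exceptions import HttpResponseError

# Azure SDK clients expect a credential object rather than a raw key string.
credential = AzureKeyCredential(os.environ['AZURE_OPENAI_KEY'])
client = OpenAIClient(os.environ['AZURE_OPENAI_ENDPOINT'], credential=credential)

try:
    messages = [{'role': 'user', 'content': 'Hello'}]
    response = client.get_chat_completions(deployment_id='my-deployment', messages=messages)
    print(response.choices[0].message.content)
except HttpResponseError as e:
    # Match on the HTTP status as well as the error name in the message text.
    if e.status_code == 429 and 'QuotaExceededDeploymentCapacity' in str(e):
        print('Quota exceeded: reduce usage or request a quota increase.')
    else:
        raise  # Re-raise anything that is not the quota error.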
Added try/except to catch the QuotaExceededDeploymentCapacity error and handle it gracefully with a clear message.

Workaround

Catch the HttpResponseError exception, detect the quota exceeded message, and implement exponential backoff retries or degrade service features temporarily until capacity frees up.
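
The backoff part of this workaround can be sketched as follows. To stay dependency-free, the sketch catches a generic `Exception` and duck-types the check on a `status_code` attribute; in real code you would catch `azure.core.exceptions.HttpResponseError` specifically. `is_quota_error`, `call_with_backoff`, and all default values are illustrative assumptions.

```python
import random
import time

def is_quota_error(exc):
    """Heuristic: HTTP 429 status, or the error name in the message text."""
    return (getattr(exc, "status_code", None) == 429
            or "QuotaExceededDeploymentCapacity" in str(exc))

def call_with_backoff(fn, *, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call fn(); on a quota error, wait and retry with doubling delays.

    Non-quota errors and the final failed attempt are re-raised unchanged.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            if not is_quota_error(exc) or attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

Wrap the request in a zero-argument callable, for example `call_with_backoff(lambda: client.get_chat_completions(...))`. Keep `max_attempts` small: if capacity is exhausted by sustained load rather than a spike, retries only add more traffic, and degrading features or requesting more quota is the better response.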

Prevention

Architect your system to monitor Azure OpenAI quota usage proactively and request quota increases before hitting limits. Use separate resources for high-demand workloads to avoid shared capacity exhaustion.

Python 3.9+ · azure-ai-openai >=1.0.0 · tested on 1.1.0
Verified 2026-04