RateLimitError
openai.RateLimitError (HTTP 429)
Stack trace
openai.RateLimitError: Error code: 429 - {'error': {'message': 'Rate limit reached', 'type': 'requests', 'code': 'rate_limit_exceeded'}}
Why it happens
The OpenAI API enforces rate limits, such as caps on requests and tokens per minute, to prevent abuse and keep usage fair. When your app sends too many requests in a short time, the server rejects the request with HTTP 429 to throttle traffic, and the Python client raises it as openai.RateLimitError.
Detection
Monitor API call frequency, and catch openai.RateLimitError so you can log each occurrence and alert on repeated rate limit breaches before your app crashes.
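One way to approach the logging and alerting described above is sketched below. It is a minimal illustration, not part of the openai library: RateLimitMonitor and call_with_monitoring are hypothetical names, the window and threshold values are arbitrary, and the exception type is passed in as a parameter so that openai.RateLimitError can be supplied by the caller.

```python
import logging
import time
from collections import deque

logger = logging.getLogger("rate_limit_monitor")


class RateLimitMonitor:
    """Hypothetical helper: counts rate-limit errors inside a sliding time window."""

    def __init__(self, window_seconds=60.0, alert_threshold=3, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.alert_threshold = alert_threshold
        self._clock = clock
        self._events = deque()  # timestamps of recent rate-limit errors

    def record(self):
        """Record one error; return True when the alert threshold is reached."""
        now = self._clock()
        self._events.append(now)
        # Drop events that have fallen out of the window.
        while self._events and now - self._events[0] > self.window_seconds:
            self._events.popleft()
        if len(self._events) >= self.alert_threshold:
            logger.warning(
                "%d rate-limit errors in the last %.0f s",
                len(self._events),
                self.window_seconds,
            )
            return True
        return False


def call_with_monitoring(fn, monitor, rate_limit_exc):
    """Run fn(); record errors of type rate_limit_exc, then re-raise them."""
    try:
        return fn()
    except rate_limit_exc:
        monitor.record()
        raise
```

With the openai client, this might be called as call_with_monitoring(lambda: client.chat.completions.create(...), monitor, RateLimitError). The warning log line is only a stand-in for whatever alerting channel the app already uses.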
Causes & fixes
Cause: Sending too many requests concurrently or in rapid succession, exceeding OpenAI's rate limits.
Fix: Throttle requests or retry with exponential backoff so that request frequency stays within the limits.
Cause: Using multiple API keys or clients without coordinating usage, so their combined traffic breaches the rate limit.
Fix: Centralize API key usage and coordinate request rates across all clients to stay within limits.
Cause: Not handling RateLimitError exceptions properly, which leads to repeated immediate retries and cascading failures.
Fix: Catch RateLimitError explicitly and apply a delay or backoff before retrying requests.
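The backoff approach mentioned in the fixes above can be sketched as follows. This is an illustrative helper under stated assumptions rather than an official API: retry_with_backoff is a hypothetical name, the delay values are arbitrary, and the exception type to retry on is a parameter so that RateLimitError can be passed in.

```python
import random
import time


def retry_with_backoff(fn, retry_on, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call fn(), retrying on `retry_on` with exponentially growing delays.

    Hypothetical helper. Delays are base_delay * 2**attempt, capped at
    max_delay, plus up to 10% random jitter so that parallel clients do
    not all retry at the same instant.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Usage with the client might look like retry_with_backoff(lambda: client.chat.completions.create(model="gpt-4o-mini", messages=messages), RateLimitError). The openai Python client also accepts a max_retries argument that controls its own built-in retries, so it may be worth checking that setting before layering a second retry loop on top.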
Code: broken vs fixed
# Broken: no handling for RateLimitError
from openai import OpenAI
client = OpenAI()
# This call may raise RateLimitError if too many requests are sent
response = client.chat.completions.create(model="gpt-4o-mini", messages=[{"role": "user", "content": "Hello"}])

# Fixed: catch RateLimitError and retry once after a delay
import os
import time
from openai import OpenAI, RateLimitError
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
try:
    response = client.chat.completions.create(model="gpt-4o-mini", messages=[{"role": "user", "content": "Hello"}])  # Fixed: added exception handling
    print(response.choices[0].message.content)
except RateLimitError:
    print("Rate limit exceeded, retrying after delay...")
    time.sleep(10)  # Wait before retrying
    # Single retry only; this call can still raise RateLimitError
    response = client.chat.completions.create(model="gpt-4o-mini", messages=[{"role": "user", "content": "Hello"}])
    print(response.choices[0].message.content)
Workaround
Wrap API calls in try/except RateLimitError blocks and retry after a simple fixed delay to avoid immediate repeated failures.
Prevention
Use client-side rate limiting, exponential backoff retries, and monitor usage metrics to stay within OpenAI's rate limits and avoid 429 errors.
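As a rough illustration of client-side rate limiting, the sketch below blocks callers until a request slot is free within a rolling window. RequestThrottle is a hypothetical class, and max_calls and period are placeholders that should be set from the limits that apply to your own account. It coordinates threads inside a single process only; sharing a limit across processes or machines would need a shared store, which is not shown here.

```python
import threading
import time
from collections import deque


class RequestThrottle:
    """Hypothetical limiter: at most max_calls per rolling `period` seconds."""

    def __init__(self, max_calls, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._timestamps = deque()  # start times of recent calls

    def acquire(self):
        """Block until a call is allowed, then reserve a slot for it."""
        with self._lock:
            while True:
                now = self._clock()
                # Forget calls that are older than the rolling window.
                while self._timestamps and now - self._timestamps[0] >= self.period:
                    self._timestamps.popleft()
                if len(self._timestamps) < self.max_calls:
                    self._timestamps.append(now)
                    return
                # Window is full: wait until the oldest call expires.
                self._sleep(self.period - (now - self._timestamps[0]))
```

Calling throttle.acquire() immediately before each client.chat.completions.create(...) call keeps one process under the configured request rate. Pairing it with backoff retries still makes sense, because the server-side limits also count tokens and traffic from other clients.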