RateLimitError
mistral.client.errors.RateLimitError (HTTP 429)
Stack trace
mistral.client.errors.RateLimitError: HTTP 429 Too Many Requests: Rate limit exceeded for your API key.
Why it happens
Mistral enforces strict rate limits on API usage to prevent abuse and ensure fair resource allocation. When your application sends requests faster than the allowed threshold, the server responds with a 429 error indicating you must slow down.
Detection
Monitor API responses for HTTP 429 status codes and log occurrences to detect when rate limits are being hit before your app crashes or degrades.
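A minimal sketch of this kind of monitoring, assuming your code can see the HTTP status of each response. The names `record_status` and `status_counts` are illustrative and not part of any Mistral client.

```python
import logging
from collections import Counter

logger = logging.getLogger("rate_limit_monitor")

# Running tally of responses by HTTP status code
status_counts = Counter()


def record_status(status_code: int) -> bool:
    """Count a response status and log a warning on HTTP 429.

    Returns True when the response was a rate-limit rejection.
    """
    status_counts[status_code] += 1
    if status_code == 429:
        logger.warning("Rate limit hit (%d so far)", status_counts[429])
        return True
    return False
```

Calling `record_status` after every response gives you a running count that can feed an alert or dashboard before the failures become user-visible.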
Causes & fixes
Sending requests too frequently without respecting Mistral's rate limits
Implement exponential backoff and retry logic with delays between requests to stay within the allowed request rate.
Using multiple parallel threads or processes that collectively exceed the rate limit
Coordinate request sending across threads/processes to throttle total request rate below the limit.
Using the wrong API key, or a key whose quota is limited
Verify your API key is valid, has sufficient quota, and is correctly set in environment variables before making requests.
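For the multi-thread case above, one possible way to coordinate senders is a single throttle object shared by every thread. This is a sketch only: `SharedThrottle` is a hypothetical helper, and the interval you pass in is an assumption that you would derive from your own plan's limit. It covers threads in one process; separate processes would need an external coordinator.

```python
import threading
import time


class SharedThrottle:
    """Spaces out calls from all threads by at least `min_interval` seconds."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self._lock = threading.Lock()
        self._next_allowed = time.monotonic()

    def wait(self) -> None:
        """Block until this caller's reserved send slot arrives."""
        # Reserve the next free slot under the lock, then sleep outside it
        # so other threads can reserve their own slots in the meantime.
        with self._lock:
            now = time.monotonic()
            slot = max(now, self._next_allowed)
            self._next_allowed = slot + self.min_interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)
```

Each worker calls `throttle.wait()` immediately before sending a request, so the combined rate stays at or below one request per `min_interval` regardless of how many threads are running.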
Code: broken vs fixed
# Broken: a single unguarded call
from mistral.client import MistralClient
import os

client = MistralClient(api_key=os.environ['MISTRAL_API_KEY'])
# This call may raise RateLimitError (HTTP 429) if too many requests are sent
response = client.generate(prompt="Hello")
print(response)

# Fixed: retry with exponential backoff to handle 429 errors gracefully
from mistral.client import MistralClient
from mistral.client.errors import RateLimitError
import os
import time

client = MistralClient(api_key=os.environ['MISTRAL_API_KEY'])
max_retries = 3
for attempt in range(max_retries):
    try:
        response = client.generate(prompt="Hello")
        print(response)
        break
    except RateLimitError:
        wait_time = 2 ** attempt  # exponential backoff: 1s, 2s, 4s
        print(f"Rate limit hit, retrying in {wait_time}s...")
        time.sleep(wait_time)
else:
    # for/else: runs only when no attempt reached `break`
    print("Failed after retries due to rate limiting.")
Workaround
Catch RateLimitError exceptions and pause your application for a fixed interval before retrying the request to avoid immediate repeated failures.
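A rough sketch of the fixed-interval workaround, written against a stand-in exception so it runs without the client installed. `retry_fixed` and its parameters are hypothetical; in real code you would catch the client's own `RateLimitError` instead of the placeholder class defined here.

```python
import time


class RateLimitError(Exception):
    """Stand-in for the client's rate-limit exception (placeholder)."""


def retry_fixed(call, retries=3, pause=5.0, sleep=time.sleep):
    """Run `call`, pausing a fixed `pause` seconds after each RateLimitError.

    Re-raises the error once `retries` retries have been used up.
    """
    for attempt in range(retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == retries:
                raise  # out of retries: surface the error to the caller
            sleep(pause)
```

A fixed pause is simpler than exponential backoff but does not adapt when the limit stays exhausted, which is why it is better treated as a stopgap.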
Prevention
Implement centralized rate limiting in your client that tracks request counts and delays, and follow Mistral's recommended request pacing so you never exceed the limits.
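One way to centralize this is a sliding-window limiter that every request passes through. The sketch below is an illustration under assumptions: `SlidingWindowLimiter` is a hypothetical class, and `max_calls` and `period` are placeholders to be set from the limits that apply to your account.

```python
import threading
import time
from collections import deque


class SlidingWindowLimiter:
    """Allows at most `max_calls` within any `period`-second window."""

    def __init__(self, max_calls: int, period: float):
        self.max_calls = max_calls
        self.period = period
        self._calls = deque()  # timestamps of recent requests
        self._lock = threading.Lock()

    def acquire(self) -> None:
        """Block until a request may be sent, then record it."""
        while True:
            with self._lock:
                now = time.monotonic()
                # Drop timestamps that have left the window
                while self._calls and now - self._calls[0] >= self.period:
                    self._calls.popleft()
                if len(self._calls) < self.max_calls:
                    self._calls.append(now)
                    return
                # Time until the oldest recorded request expires
                wait = self.period - (now - self._calls[0])
            time.sleep(wait)
```

Creating one limiter per API key and calling `limiter.acquire()` before each `client.generate(...)` call keeps the request count in one place, so the limit is respected wherever requests originate within the process.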