High severity · HTTP 429 · Beginner · Fix: 2-5 min

RateLimitError

mistral.client.errors.RateLimitError (HTTP 429)

What this error means
The Mistral API returns HTTP 429 (RateLimitError) when your application sends more requests in a short period than its quota allows.

Stack trace

traceback
mistral.client.errors.RateLimitError: HTTP 429 Too Many Requests: Rate limit exceeded for your API key.
QUICK FIX
Catch RateLimitError and retry with exponential backoff so that 429 responses are handled automatically.

Why it happens

Mistral enforces strict rate limits on API usage to prevent abuse and ensure fair resource allocation. When your application sends requests faster than the allowed threshold, the server responds with a 429 error indicating you must slow down.

Detection

Monitor API responses for HTTP 429 status codes and log occurrences to detect when rate limits are being hit before your app crashes or degrades.
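
One way to make this monitoring concrete is a small counter that records 429 responses in a sliding time window and logs each one. This is a minimal sketch using only the standard library; the class name `RateLimitMonitor`, the window length, and the alert threshold are illustrative assumptions, not part of any Mistral client.

```python
import logging
import time
from collections import deque

logger = logging.getLogger("rate_limit_monitor")


class RateLimitMonitor:
    """Hypothetical helper: counts HTTP 429 responses seen in a sliding window."""

    def __init__(self, window_seconds=60.0, alert_threshold=5, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.alert_threshold = alert_threshold
        self._clock = clock          # injectable so the class can be tested without waiting
        self._hits = deque()         # timestamps of recent 429 responses

    def record(self, status_code):
        """Record one response; return True once the alert threshold is reached."""
        now = self._clock()
        if status_code == 429:
            self._hits.append(now)
        # Drop 429 timestamps that have aged out of the window.
        while self._hits and now - self._hits[0] > self.window_seconds:
            self._hits.popleft()
        if status_code == 429:
            logger.warning("HTTP 429 received (%d in the last %.0fs)",
                           len(self._hits), self.window_seconds)
        return len(self._hits) >= self.alert_threshold
```

Call `monitor.record(429)` wherever your code catches the rate limit error, and `monitor.record(200)` on success; a `True` return value is the signal to slow down or raise an alert.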

Causes & fixes

1

Sending requests too frequently without respecting Mistral's rate limits

✓ Fix

Implement exponential backoff and retry logic with delays between requests to stay within the allowed request rate.

2

Using multiple parallel threads or processes that collectively exceed the rate limit

✓ Fix

Coordinate request sending across threads/processes to throttle total request rate below the limit.

3

Using the wrong API key, or a key with a limited quota

✓ Fix

Verify your API key is valid, has sufficient quota, and is correctly set in environment variables before making requests.
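
For cause 2 above, coordination across threads can be sketched with a shared throttle that hands out evenly spaced time slots. This is a minimal illustration, not a Mistral feature: `SharedThrottle` and `min_interval` are hypothetical names, and the right interval depends on the limits of your own plan. It only coordinates threads inside one process; separate processes would need a shared external store.

```python
import threading
import time


class SharedThrottle:
    """Hypothetical helper: keeps calls from all threads at least min_interval apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot = clock()  # earliest time the next request may start

    def wait(self):
        """Reserve the next free slot, sleep until it arrives, and return the delay."""
        with self._lock:
            now = self._clock()
            slot = max(now, self._next_slot)
            self._next_slot = slot + self.min_interval
        delay = slot - now
        if delay > 0:
            self._sleep(delay)  # sleep outside the lock so other threads can reserve slots
        return delay
```

Create one `SharedThrottle` instance, share it with every worker thread, and call `throttle.wait()` immediately before each API request.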

Code: broken vs fixed

Broken - triggers the error
python
from mistral.client import MistralClient
import os

client = MistralClient(api_key=os.environ['MISTRAL_API_KEY'])

# This call may raise RateLimitError if too many requests are sent
response = client.generate(prompt="Hello")  # triggers RateLimitError 429
print(response)
Fixed - works correctly
python
from mistral.client import MistralClient
from mistral.client.errors import RateLimitError  # module path as shown in the stack trace
import os
import time

client = MistralClient(api_key=os.environ['MISTRAL_API_KEY'])

max_retries = 3
for attempt in range(max_retries):
    try:
        response = client.generate(prompt="Hello")
        print(response)
        break
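    except RateLimitError:
        if attempt == max_retries - 1:
            continue  # last attempt: skip the wait and fall through to the else branch
        wait_time = 2 ** attempt  # exponential backoff: 1s, then 2s
        print(f"Rate limit hit, retrying in {wait_time}s...")
        time.sleep(wait_time)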
else:
    print("Failed after retries due to rate limiting.")

The fix wraps the call in a retry loop with exponential backoff: when RateLimitError is raised, the code waits and then retries automatically.

Workaround

Catch RateLimitError exceptions and pause your application for a fixed interval before retrying the request to avoid immediate repeated failures.
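
The fixed-interval workaround can be sketched as a small generic helper. The function name `retry_fixed` and its parameters are hypothetical; it takes the exception class as an argument, so it makes no assumption about how the client library is laid out.

```python
import time


def retry_fixed(call, retry_on, interval=5.0, max_attempts=3, sleep=time.sleep):
    """Run call(); if it raises retry_on, pause for a fixed interval and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error to the caller
            sleep(interval)
```

With the client from this page, usage would look like `retry_fixed(lambda: client.generate(prompt="Hello"), RateLimitError)`. A fixed pause is simpler than exponential backoff but recovers more slowly from sustained limiting, so treat it as a stopgap.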

Prevention

Implement centralized rate limiting in your client that tracks request counts and enforces delays, and follow Mistral's recommended request pacing so you never exceed the limits.
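
A centralized limiter that tracks request counts might look like the sliding-window sketch below. The class name `SlidingWindowLimiter` and the numbers in the usage note are illustrative assumptions; the actual limits for your key are not stated on this page and should come from your Mistral account. The class is not thread-safe on its own, so guard it with a lock if several threads share it.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical helper: allows at most max_requests per window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent = deque()  # timestamps of requests inside the current window

    def time_until_allowed(self):
        """Return how many seconds to wait before the next request (0.0 if none)."""
        now = self._clock()
        # Forget requests that have left the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def try_acquire(self):
        """Record a request and return True if one is allowed now, else return False."""
        if self.time_until_allowed() > 0:
            return False
        self._sent.append(self._clock())
        return True
```

For example, `SlidingWindowLimiter(max_requests=60, window_seconds=60.0)` would cap traffic at a hypothetical 60 requests per minute: call `try_acquire()` before each request and, when it returns `False`, sleep for `time_until_allowed()` seconds before trying again.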

Python 3.9+ · mistral-client >=1.0.0 · tested on 1.2.0
Verified 2026-04