LiteLLM error codes reference
LiteLLM can raise error types such as RateLimitError, ConnectionError, or InvalidRequestError during API calls. Handling these requires catching exceptions from the LiteLLM client and implementing retries or correcting request parameters.
Why this happens
LiteLLM error codes typically arise from API request issues such as exceeding rate limits, malformed requests, or network failures. For example, a RateLimitError occurs when too many requests are sent in a short time, triggering a 429 HTTP response. A ConnectionError happens if the client cannot reach the LiteLLM server due to network problems. An InvalidRequestError indicates incorrect parameters or missing required fields in the request payload.
Example broken code triggering RateLimitError:
from litellm import LiteLLMClient
import os

client = LiteLLMClient(api_key=os.environ["LITELLM_API_KEY"])

# Rapid loop causing rate limit error
for _ in range(100):
    response = client.chat_completion(model="litellm-v1", messages=[{"role": "user", "content": "Hello"}])
    print(response.text)

Output:
litellm.errors.RateLimitError: Too many requests, please slow down.
The fix
Implementing retry logic with exponential backoff prevents the immediate repeated requests that cause RateLimitError. Validating request parameters before sending avoids InvalidRequestError, and catching each exception type allows graceful error handling and fallback.
Corrected code with retry and error handling:
from litellm import LiteLLMClient, errors
import os
import time

client = LiteLLMClient(api_key=os.environ["LITELLM_API_KEY"])
max_retries = 5

for _ in range(100):
    retry_delay = 1  # seconds; reset for each new request
    for attempt in range(max_retries):
        try:
            response = client.chat_completion(model="litellm-v1", messages=[{"role": "user", "content": "Hello"}])
            print(response.text)
            break
        except errors.RateLimitError:
            time.sleep(retry_delay)
            retry_delay *= 2  # exponential backoff
        except errors.InvalidRequestError as e:
            print(f"Invalid request: {e}")
            break  # retrying cannot fix a malformed request
        except errors.ConnectionError as e:
            print(f"Connection failed: {e}")
            time.sleep(retry_delay)
            retry_delay *= 2

Output:
Hello
Hello
... (repeated without errors)
Preventing it in production
Use robust retry mechanisms with capped exponential backoff and jitter to avoid synchronized retries. Validate all request inputs before sending. Monitor API usage to stay within rate limits. Implement circuit breakers to degrade gracefully if the API is unavailable. Logging errors helps diagnose persistent issues.
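The capped exponential backoff with jitter described above can be sketched with a small stdlib-only helper. The function name and parameters here are illustrative, not part of the LiteLLM API; full jitter (a random delay between zero and the capped exponential value) is one common way to desynchronize retrying clients.

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Capped exponential backoff with full jitter.

    Returns a random delay in [0, min(cap, base * 2**attempt)],
    so delays grow with the attempt number but never exceed the cap.
    """
    return random.uniform(0, min(cap, base * (2 ** attempt)))

# Example: sample a delay for each retry attempt
for attempt in range(6):
    delay = backoff_delay(attempt)
    assert 0 <= delay <= 30.0
```

In a retry loop, this helper would replace the fixed `retry_delay *= 2` pattern: call `time.sleep(backoff_delay(attempt))` inside the except clause, so concurrent clients that hit the rate limit at the same moment do not all retry in lockstep.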
Key Takeaways
- Always catch LiteLLM API exceptions to handle errors gracefully.
- Implement exponential backoff retries to manage rate limits effectively.
- Validate request parameters to prevent invalid request errors.