How to use retry logic in LiteLLM
Quick answer
Use Python's standard retry mechanisms like the
retrying or tenacity libraries to wrap your LiteLLM API calls. Implement retries with exponential backoff around the LiteLLM client calls to handle transient network or server errors gracefully.PREREQUISITES
Python 3.8+pip install litellmpip install tenacity (or retrying)Basic knowledge of Python exception handling
Setup
Install litellm and a retry library like tenacity to enable retry logic in your Python environment.
pip install litellm tenacity Step by step
Wrap your LiteLLM API call with a retry decorator from tenacity to automatically retry on exceptions such as network errors or timeouts.
from litellm import LiteLLM
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type
import requests
# Initialize LiteLLM client
client = LiteLLM()
# Define retry logic: retry up to 3 times with exponential backoff on network errors
@retry(
stop=stop_after_attempt(3),
wait=wait_exponential(multiplier=1, min=2, max=10),
retry=retry_if_exception_type((requests.exceptions.RequestException,))
)
def generate_text_with_retry(prompt: str) -> str:
response = client.generate(prompt)
return response.text
if __name__ == "__main__":
prompt = "Write a short poem about retry logic."
try:
result = generate_text_with_retry(prompt)
print("Generated text:\n", result)
except Exception as e:
print(f"Failed after retries: {e}") output
Generated text: Retry logic helps us try again, When networks fail or errors reign. With backoff waits and attempts three, Success will come, just wait and see.
Common variations
- Use
retryinglibrary instead oftenacityfor simpler retry needs. - Customize retry conditions to catch LiteLLM-specific exceptions if available.
- Implement async retry logic with
asyncioandtenacityfor asynchronous LiteLLM calls. - Adjust backoff parameters and max attempts based on your application's tolerance for latency and failure.
import asyncio
from litellm import LiteLLMAsync
from tenacity import AsyncRetrying, stop_after_attempt, wait_fixed
client = LiteLLMAsync()
async def generate_async_with_retry(prompt: str) -> str:
async for attempt in AsyncRetrying(stop=stop_after_attempt(3), wait=wait_fixed(2)):
with attempt:
response = await client.generate(prompt)
return response.text
async def main():
prompt = "Explain retry logic asynchronously."
try:
text = await generate_async_with_retry(prompt)
print(text)
except Exception as e:
print(f"Async retry failed: {e}")
if __name__ == "__main__":
asyncio.run(main()) output
Retry logic asynchronously means trying again after failures with delays.
Troubleshooting
- If retries do not trigger, ensure you are catching the correct exceptions (e.g., network errors).
- Check that your retry decorator or loop wraps the exact LiteLLM call that may fail.
- For persistent failures, increase max attempts or add logging to diagnose issues.
- Verify network connectivity and API key validity if errors persist.
Key Takeaways
- Use Python retry libraries like
tenacityto add robust retry logic around LiteLLM calls. - Configure retries with exponential backoff to avoid overwhelming the API during transient failures.
- Wrap only the network call to LiteLLM to precisely handle retryable exceptions.
- Async LiteLLM clients support retry logic with async-compatible retry decorators.
- Always log retry attempts and failures to monitor reliability in production.