How to beginner · 3 min read

How to use retry logic in LiteLLM

Q: How to use retry logic in LiteLLM

Use Python's standard retry mechanisms like the retrying or tenacity libraries to wrap your LiteLLM API calls. Implement retries with exponential backoff around the LiteLLM client calls to handle transient network or server errors gracefully.

Quick answer

Use Python's standard retry mechanisms like the retrying or tenacity libraries to wrap your LiteLLM API calls. Implement retries with exponential backoff around the LiteLLM client calls to handle transient network or server errors gracefully.

PREREQUISITES

Python 3.8+
pip install litellm
pip install tenacity (or retrying)
Basic knowledge of Python exception handling

Setup

Install litellm and a retry library like tenacity to enable retry logic in your Python environment.

bash

pip install litellm tenacity

Step by step

Wrap your LiteLLM API call with a retry decorator from tenacity to automatically retry on exceptions such as network errors or timeouts.

python

from litellm import LiteLLM
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type
import requests

# Initialize LiteLLM client
client = LiteLLM()

# Define retry logic: retry up to 3 times with exponential backoff on network errors
@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
    retry=retry_if_exception_type((requests.exceptions.RequestException,))
)
def generate_text_with_retry(prompt: str) -> str:
    response = client.generate(prompt)
    return response.text

if __name__ == "__main__":
    prompt = "Write a short poem about retry logic."
    try:
        result = generate_text_with_retry(prompt)
        print("Generated text:\n", result)
    except Exception as e:
        print(f"Failed after retries: {e}")

output

Generated text:
 Retry logic helps us try again,
 When networks fail or errors reign.
 With backoff waits and attempts three,
 Success will come, just wait and see.

Common variations

Use retrying library instead of tenacity for simpler retry needs.
Customize retry conditions to catch LiteLLM-specific exceptions if available.
Implement async retry logic with asyncio and tenacity for asynchronous LiteLLM calls.
Adjust backoff parameters and max attempts based on your application's tolerance for latency and failure.

python

import asyncio
from litellm import LiteLLMAsync
from tenacity import AsyncRetrying, stop_after_attempt, wait_fixed

client = LiteLLMAsync()

async def generate_async_with_retry(prompt: str) -> str:
    async for attempt in AsyncRetrying(stop=stop_after_attempt(3), wait=wait_fixed(2)):
        with attempt:
            response = await client.generate(prompt)
            return response.text

async def main():
    prompt = "Explain retry logic asynchronously."
    try:
        text = await generate_async_with_retry(prompt)
        print(text)
    except Exception as e:
        print(f"Async retry failed: {e}")

if __name__ == "__main__":
    asyncio.run(main())

output

Retry logic asynchronously means trying again after failures with delays.

Troubleshooting

If retries do not trigger, ensure you are catching the correct exceptions (e.g., network errors).
Check that your retry decorator or loop wraps the exact LiteLLM call that may fail.
For persistent failures, increase max attempts or add logging to diagnose issues.
Verify network connectivity and API key validity if errors persist.

✅

Key Takeaways

Use Python retry libraries like tenacity to add robust retry logic around LiteLLM calls.
Configure retries with exponential backoff to avoid overwhelming the API during transient failures.
Wrap only the network call to LiteLLM to precisely handle retryable exceptions.
Async LiteLLM clients support retry logic with async-compatible retry decorators.
Always log retry attempts and failures to monitor reliability in production.

Verified 2026-04 · LiteLLM

Verify ↗