How to beginner · 3 min read

How to retry Claude API calls with backoff

Quick answer
Wrap your client.messages.create() call in a retry loop: catch the exception the anthropic SDK raises, wait with time.sleep(), and double the delay after each failed attempt. This exponential backoff gives transient failures such as rate limits or network errors time to clear before the next attempt.

PREREQUISITES

  • Python 3.8+
  • Anthropic API key
  • pip install "anthropic>=0.20" (quote the requirement so the shell does not treat >= as a redirect)

Setup

Install the anthropic Python SDK and set your API key as an environment variable.

  • Install SDK: pip install anthropic
  • Set environment variable: export ANTHROPIC_API_KEY='your_api_key' (Linux/macOS) or setx ANTHROPIC_API_KEY "your_api_key" (Windows; setx only takes effect in terminal windows opened afterwards)
bash
pip install anthropic

Step by step

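This example retries client.messages.create() with exponential backoff when the call raises. It makes up to 5 attempts and doubles the delay after each failure (1, 2, 4, then 8 seconds). The SDK also retries some errors on its own (2 times by default, configurable through the client's max_retries option), so each attempt in the loop may already include SDK-level retries.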

python
import os
import time
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

max_retries = 5
base_delay = 1  # seconds

for attempt in range(1, max_retries + 1):
    try:
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=500,
            system="You are a helpful assistant.",
            messages=[{"role": "user", "content": "Hello, retry with backoff example."}]
        )
        print("Response:", response.content[0].text)
        break  # Success, exit loop
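    # Retry only transient failures; auth or bad-request errors propagate immediately.
    except (anthropic.APIConnectionError, anthropic.RateLimitError, anthropic.InternalServerError) as e: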
        print(f"Attempt {attempt} failed: {e}")
        if attempt == max_retries:
            print("Max retries reached. Exiting.")
            raise
        sleep_time = base_delay * (2 ** (attempt - 1))
        print(f"Retrying in {sleep_time} seconds...")
        time.sleep(sleep_time)
output
Response: <the model's reply; the exact text varies between runs>
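
The loop above can also be packaged as a reusable helper. The following is a minimal sketch, not part of the SDK: the name call_with_backoff and its parameters (fn, retry_on, sleep) are hypothetical and chosen only for illustration. It takes any zero-argument callable, so it can be tried without making a real API call.

```python
import time


def call_with_backoff(fn, max_retries=5, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call fn() and retry on the given exception types with exponential backoff.

    Hypothetical helper for illustration; `sleep` is injectable so the
    delays can be observed in tests without actually waiting.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return fn()
        except retry_on as exc:
            if attempt == max_retries:
                raise  # out of attempts: surface the last error to the caller
            delay = base_delay * (2 ** (attempt - 1))  # 1, 2, 4, 8, ...
            print(f"Attempt {attempt} failed ({exc!r}); retrying in {delay} s")
            sleep(delay)
```

With the client from the example above, a call might look like call_with_backoff(lambda: client.messages.create(...), retry_on=(anthropic.APIConnectionError, anthropic.RateLimitError, anthropic.InternalServerError)). Exceptions outside retry_on are not caught, so they fail fast instead of being retried.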

Common variations

You can customize retry logic by:

  • Using jitter to randomize backoff delays and reduce thundering herd issues.
  • Implementing async retries with asyncio.sleep() if using an async Anthropic client.
  • Adjusting max_retries and base_delay based on your app's tolerance.
  • Handling specific exceptions such as anthropic.RateLimitError or anthropic.APIConnectionError (both subclasses of anthropic.APIError) for more granular control.
python
import random
import time

# Reuses the `client` created in the step-by-step example.

max_retries = 5
base_delay = 1

for attempt in range(1, max_retries + 1):
    try:
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=500,
            system="You are a helpful assistant.",
            messages=[{"role": "user", "content": "Hello with jitter backoff."}]
        )
        print("Response:", response.content[0].text)
        break
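    # Retry only transient failures; other errors propagate immediately.
    except (anthropic.APIConnectionError, anthropic.RateLimitError, anthropic.InternalServerError) as e: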
        print(f"Attempt {attempt} failed: {e}")
        if attempt == max_retries:
            raise
        # Add jitter: randomize sleep time between 0 and base_delay * 2^(attempt-1)
        sleep_time = random.uniform(0, base_delay * (2 ** (attempt - 1)))
        print(f"Retrying in {sleep_time:.2f} seconds...")
        time.sleep(sleep_time)
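
The async variation mentioned in the list above could be sketched as follows. This is an illustrative sketch under stated assumptions: call_with_backoff_async and make_call are hypothetical names, and make_call stands for any zero-argument function that returns an awaitable. It uses the same full-jitter delay as the synchronous example, but waits with asyncio.sleep() so other tasks keep running during the pause.

```python
import asyncio
import random


async def call_with_backoff_async(make_call, max_retries=5, base_delay=1.0,
                                  retry_on=(Exception,)):
    """Await make_call() and retry with full-jitter exponential backoff.

    Hypothetical helper for illustration; make_call must return a fresh
    awaitable on every invocation, because a coroutine can only be awaited once.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return await make_call()
        except retry_on:
            if attempt == max_retries:
                raise  # out of attempts: surface the last error
            # Full jitter: pick a delay between 0 and the exponential cap.
            delay = random.uniform(0, base_delay * (2 ** (attempt - 1)))
            # asyncio.sleep yields to the event loop instead of blocking it.
            await asyncio.sleep(delay)
```

With an anthropic.AsyncAnthropic client, make_call might be lambda: async_client.messages.create(...), where async_client is an assumed variable name for that client.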

Troubleshooting

If you encounter persistent failures:

  • Check your API key and environment variable setup.
  • Verify network connectivity and Anthropic service status.
  • Catch and log specific exceptions to identify rate limits or quota issues.
  • Increase base_delay, and max_retries if needed, when you hit rate limits.
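
When logging shows that rate limits are the cause, it can help to decide per error whether a retry makes sense and how long to wait. The sketch below is one possible approach, not the SDK's own logic: should_retry, next_delay, the set of retryable status codes, and the 60-second cap are all assumptions made for illustration. It also assumes the server may send a Retry-After value in seconds, which is preferred over the computed backoff when present.

```python
# Assumption: these statuses are treated as transient, alongside any 5xx.
RETRYABLE_STATUS = {408, 409, 429}


def should_retry(status_code):
    """Return True for status codes that are usually worth retrying."""
    return status_code in RETRYABLE_STATUS or status_code >= 500


def next_delay(attempt, base_delay=1.0, retry_after=None, max_delay=60.0):
    """Pick the wait before the next attempt (attempt numbers start at 1).

    A server-provided Retry-After value in seconds wins over the computed
    exponential backoff; both are capped at max_delay.
    """
    if retry_after is not None:
        try:
            return min(max(float(retry_after), 0.0), max_delay)
        except (TypeError, ValueError):
            pass  # not a plain number of seconds: fall back to backoff
    return min(base_delay * (2 ** (attempt - 1)), max_delay)
```

With the SDK, the status code of a failed request is available as status_code on anthropic.APIStatusError, and the header could be read with exc.response.headers.get("retry-after"); whether that header is present on a given response is an assumption to verify against your own logs.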

Key Takeaways

  • Use try-except blocks around client.messages.create() to catch API call failures.
  • Implement exponential backoff with increasing delays to avoid hammering the API on errors.
  • Add jitter to backoff delays to reduce simultaneous retries from multiple clients.
  • Adjust retry parameters based on your application's tolerance for latency and failure.
  • Always log errors and retry attempts for easier debugging and monitoring.
Verified 2026-04 · claude-3-5-sonnet-20241022