
How to make concurrent Claude API calls

Quick answer
Use Python's asyncio with the Anthropic SDK's async client, anthropic.AsyncAnthropic, to make concurrent calls to Claude models. Instantiate anthropic.AsyncAnthropic with your API key, then await several client.messages.create calls at once with asyncio.gather so the requests overlap instead of running one after another.

PREREQUISITES

  • Python 3.8+
  • Anthropic API key
  • pip install anthropic>=0.20

Setup

Install the Anthropic Python SDK and set your API key as an environment variable.

bash
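# Quote the requirement so the shell does not treat ">" as a redirect.
pip install "anthropic>=0.20"
# Replace the placeholder with your own key.
export ANTHROPIC_API_KEY="your-api-key"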
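Before making any requests, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch; `require_api_key` is a hypothetical helper name, not part of the SDK.

```python
import os

def require_api_key(env=None):
    # Hypothetical helper: read from the given mapping, or the real environment by default.
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; export it before running the script.")
    return key
```

Calling `require_api_key()` at the start of a script fails immediately with a clear message instead of surfacing later as an authentication error.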

Step by step

This example demonstrates how to make multiple concurrent calls to the Claude API using asyncio and the Anthropic SDK's async client.

python
import os
import asyncio
import anthropic

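async def call_claude(client, prompt):
    # On AsyncAnthropic, messages.create is a coroutine and must be awaited.
    response = await client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=200,
        system="You are a helpful assistant.",
        messages=[{"role": "user", "content": prompt}]
    )
    # The reply text is in the first content block.
    return response.content[0].text

async def main():
    # Use the async client; the synchronous anthropic.Anthropic client cannot be awaited.
    client = anthropic.AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])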

    prompts = [
        "Explain quantum computing in simple terms.",
        "Write a Python function to reverse a string.",
        "Summarize the plot of Hamlet."
    ]

    tasks = [call_claude(client, prompt) for prompt in prompts]
    results = await asyncio.gather(*tasks)

    for i, result in enumerate(results, 1):
        print(f"Response {i}:\n{result}\n")

if __name__ == "__main__":
    asyncio.run(main())
output
Response 1:
Quantum computing is a type of computation that uses quantum bits or qubits, which can be in multiple states at once, allowing complex problems to be solved faster than classical computers.

Response 2:
def reverse_string(s):
    return s[::-1]

Response 3:
Hamlet is a tragedy by Shakespeare about a prince who seeks revenge against his uncle, who murdered Hamlet's father and took the throne.
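
To see what asyncio.gather does without spending API credits, the sketch below swaps the network call for a stand-in coroutine. `fake_call` and its delays are assumptions made for illustration; only `asyncio.gather` and `asyncio.sleep` are real library calls.

```python
import asyncio
import time

async def fake_call(prompt, delay):
    # Stand-in for an awaited API call; sleeps instead of using the network.
    await asyncio.sleep(delay)
    return f"reply to: {prompt}"

async def run_concurrently():
    # (prompt, simulated latency in seconds); the slowest request is listed first.
    jobs = [("first", 0.3), ("second", 0.2), ("third", 0.1)]
    start = time.perf_counter()
    results = await asyncio.gather(*(fake_call(p, d) for p, d in jobs))
    elapsed = time.perf_counter() - start
    return results, elapsed

if __name__ == "__main__":
    results, elapsed = asyncio.run(run_concurrently())
    print(results)            # results follow input order, not completion order
    print(f"{elapsed:.2f}s")  # close to the slowest delay, not the 0.6 s sum
```

The results come back in the order the coroutines were passed in, which is why the example above can pair each response with its prompt by position.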

Common variations

  • Use different Claude models by changing the model parameter, e.g., claude-3-opus-20240229.
  • Adjust max_tokens or add other parameters like temperature for varied outputs.
  • For synchronous calls, use anthropic.Anthropic and client.messages.create without asyncio; each call blocks until it returns, so requests run one at a time.
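
One way to keep these variations in one place is a small function that assembles the keyword arguments for a request. This is a sketch only: `build_request` is a hypothetical helper, and its defaults simply mirror the example above.

```python
def build_request(prompt, model="claude-3-5-sonnet-20241022", max_tokens=200, temperature=None):
    # Hypothetical helper: keyword arguments intended for client.messages.create(**request).
    request = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    # Only include temperature when the caller sets it, so the API default applies otherwise.
    if temperature is not None:
        request["temperature"] = temperature
    return request

if __name__ == "__main__":
    print(build_request("Summarize the plot of Hamlet.",
                        model="claude-3-opus-20240229", temperature=0.2))
```

The returned dictionary can then be unpacked into the call, for example `await client.messages.create(**build_request(prompt, temperature=0.2))`.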

Troubleshooting

  • If you get authentication errors, verify your ANTHROPIC_API_KEY environment variable is set correctly.
  • For rate limit errors, reduce concurrency or add retry logic with backoff.
  • Make sure the client is anthropic.AsyncAnthropic and that every client.messages.create call is awaited; the synchronous client blocks the event loop, so the calls run one after another.
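
The rate limit advice above can be sketched with an asyncio.Semaphore to cap in-flight requests and a retry loop with exponential backoff. The names `RateLimited`, `with_retries`, and `bounded_gather` are hypothetical; `RateLimited` stands in for the SDK's rate limit exception so the sketch runs without network access.

```python
import asyncio
import random

class RateLimited(Exception):
    """Hypothetical stand-in for the SDK's rate limit error."""

async def with_retries(make_call, max_attempts=4, base_delay=0.5):
    # make_call is a zero-argument function that returns a fresh coroutine per attempt.
    for attempt in range(max_attempts):
        try:
            return await make_call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Exponential backoff with jitter: base, 2x base, 4x base, ...
            delay = base_delay * (2 ** attempt)
            await asyncio.sleep(delay + random.uniform(0, delay / 2))

async def bounded_gather(make_calls, limit=2):
    # At most `limit` calls are in flight at once.
    semaphore = asyncio.Semaphore(limit)

    async def run_one(make_call):
        async with semaphore:
            return await with_retries(make_call)

    return await asyncio.gather(*(run_one(m) for m in make_calls))
```

With the real client, each entry in `make_calls` could be something like `lambda p=prompt: call_claude(client, p)`, and the `except` clause would catch `anthropic.RateLimitError` instead. The SDK client also accepts a `max_retries` argument, which may be enough on its own for light workloads. Note that this sketch holds the semaphore slot while backing off, which deliberately lowers pressure on the API during a rate limit.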

Key Takeaways

  • Use the Anthropic SDK's async client (anthropic.AsyncAnthropic) with asyncio for concurrent Claude API calls.
  • Await multiple client.messages.create calls with asyncio.gather so requests overlap and total wait time drops.
  • Read your API key from an environment variable rather than hard-coding it, and handle rate limits gracefully.
Verified 2026-04 · claude-3-5-sonnet-20241022, claude-3-opus-20240229