API Intermediate medium · 6 min

Submitting a batch

What you will learn

Submit multiple API requests as a single batch job that is processed asynchronously at a 50% discount on both input and output tokens.

Why this matters

Batches let you process large volumes of requests cost-effectively when you don't need real-time responses: ideal for data processing pipelines, content generation at scale, or analyzing thousands of documents overnight.

Skip if: you need sub-second latency or real-time interaction, or you have only a handful of requests (fewer than 10-20) to process. Use the synchronous API directly in those cases. Batches introduce processing delays (from about a minute to several hours) and are not suitable for chatbots or request-response workflows.

Explanation

What it does: The Anthropic Batches API lets you submit many Messages requests in a single call, which Claude then processes asynchronously. Batch usage is billed at 50% of standard synchronous rates for both input and output tokens.

How it works: You build a list of requests, each with a unique custom_id and a params object holding the usual Messages API parameters, and pass it to client.beta.messages.batches.create(requests=...). The API returns a batch object with an ID immediately. You poll client.beta.messages.batches.retrieve(batch_id) to check status, and when processing completes, stream the results with client.beta.messages.batches.results(batch_id). Results can arrive in any order, so match them to their inputs by custom_id.
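
The request list described above can be generated from raw inputs. This is a minimal sketch, not part of the SDK: the helper name build_requests, the "request-N" ID scheme, and the summarization prompt are assumptions chosen to mirror the example on this page.

```python
from typing import Any


def build_requests(texts: list[str], model: str, max_tokens: int = 256) -> list[dict[str, Any]]:
    """Turn raw texts into batch request dicts with unique custom_ids (hypothetical helper)."""
    requests = []
    for index, text in enumerate(texts, start=1):
        requests.append(
            {
                # custom_id is how each result is matched back to its input later.
                "custom_id": f"request-{index}",
                "params": {
                    "model": model,
                    "max_tokens": max_tokens,
                    "messages": [
                        {"role": "user", "content": f"Summarize this in one sentence: {text}"}
                    ],
                },
            }
        )
    return requests


example = build_requests(["First document.", "Second document."], model="claude-opus-4-6")
print([request["custom_id"] for request in example])
```

Generating IDs from a stable key (a database ID or file name) instead of a counter makes retries easier to trace.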

When to use it: Batch processing works best for workflows where latency tolerance is minutes to hours: bulk content moderation, summarizing document libraries, generating product descriptions from templates, or running weekly analysis jobs. Each batch can contain up to 100,000 requests or 256 MB of payload, whichever is reached first. Most batches finish in under an hour, depending on queue depth, and requests still unprocessed after 24 hours expire.
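
When a workload exceeds the per-batch request limit, it has to be split across several batches. The sketch below shows one way to do that. chunk_requests is a hypothetical helper, and the limit is passed in by the caller rather than hard-coded, so check the current limit for your account before choosing a value.

```python
from typing import Iterator, Sequence


def chunk_requests(requests: Sequence[dict], max_per_batch: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most max_per_batch requests."""
    if max_per_batch < 1:
        raise ValueError("max_per_batch must be at least 1")
    for start in range(0, len(requests), max_per_batch):
        yield list(requests[start:start + max_per_batch])


# 25 placeholder requests split into batches of at most 10.
placeholder = [{"custom_id": str(i)} for i in range(25)]
print([len(chunk) for chunk in chunk_requests(placeholder, 10)])  # [10, 10, 5]
```

This splits by request count only. A payload-size limit would need a separate check on the serialized size of each chunk.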

Request code

python

import time

import anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

# Each entry pairs a unique custom_id with ordinary Messages API parameters.
requests = [
    {
        "custom_id": "request-1",
        "params": {
            "model": "claude-opus-4-6",
            "max_tokens": 256,
            "messages": [
                {"role": "user", "content": "Summarize this in one sentence: Machine learning is a subset of artificial intelligence focused on learning patterns from data."}
            ]
        }
    },
    {
        "custom_id": "request-2",
        "params": {
            "model": "claude-opus-4-6",
            "max_tokens": 256,
            "messages": [
                {"role": "user", "content": "Summarize this in one sentence: Deep learning uses neural networks with multiple layers to extract increasingly abstract representations."}
            ]
        }
    }
]

# Submit every request in one call; the batch object comes back immediately.
batch = client.beta.messages.batches.create(requests=requests)
batch_id = batch.id
print(f"Created batch: {batch_id}")
print(f"Status: {batch.processing_status}")

# Poll until processing ends, backing off from 5 seconds up to 5 minutes.
delay = 5
while batch.processing_status != "ended":
    time.sleep(delay)
    delay = min(delay * 2, 300)
    batch = client.beta.messages.batches.retrieve(batch_id)
    print(f"Batch status: {batch.processing_status}")

print(f"\nRequest counts: {batch.request_counts}")

# results() streams one entry per request, in no guaranteed order.
for entry in client.beta.messages.batches.results(batch_id):
    if entry.result.type == "succeeded":
        print(f"Request {entry.custom_id}: {entry.result.message.content[0].text}")
    else:
        # Errored, canceled, or expired entries carry no message.
        print(f"Request {entry.custom_id}: {entry.result.type}")

Authentication

Batches require a standard Anthropic API key set as the ANTHROPIC_API_KEY environment variable. The beta Batches API endpoints are automatically available in anthropic 0.94.x and later. No additional authentication setup is needed beyond standard API key configuration.

Response shape

Field	Description
`id`	string: unique batch identifier
`type`	string: always 'message_batch'
`processing_status`	string: 'in_progress', 'canceling', or 'ended'
`request_counts`	object: number of requests in each state (processing, succeeded, errored, canceled, expired)
`results_url`	string: URL of the JSONL results file (null until processing_status == 'ended')
`created_at`	string: ISO 8601 timestamp when batch was created
`expires_at`	string: ISO 8601 timestamp after which unfinished requests expire (24 hours after creation)

Field guide

processing_status

Keep polling while the status is 'in_progress' (or 'canceling' after a cancel request). Stop when it is 'ended'. This is the field that tells you when to call results().

request_counts

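Audit this before calling results(). If errored, canceled, or expired is greater than 0, those requests appear in the same results stream with a result type other than 'succeeded'. Collect their custom_ids and handle them separately.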
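
A small audit of request_counts might look like the sketch below. It assumes the counts have already been converted to a plain dict and that the keys are processing, succeeded, errored, canceled, and expired. summarize_counts is a hypothetical helper, not an SDK function.

```python
def summarize_counts(counts: dict[str, int]) -> dict[str, int]:
    """Collapse per-state counts into totals that are easy to log or alert on."""
    # Key names are an assumption about the shape of request_counts.
    failed = counts.get("errored", 0) + counts.get("canceled", 0) + counts.get("expired", 0)
    return {
        "total": sum(counts.values()),
        "pending": counts.get("processing", 0),
        "succeeded": counts.get("succeeded", 0),
        "failed": failed,
    }


summary = summarize_counts(
    {"processing": 0, "succeeded": 980, "errored": 15, "canceled": 0, "expired": 5}
)
print(summary)
```

Comparing summary["total"] with the number of requests you submitted is a cheap check that nothing was dropped.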

results_url

Only populated when processing_status == 'ended'. The SDK's results(batch_id) call reads from this URL for you, so fetch it directly only if you need the raw JSONL file.

expires_at

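expires_at marks the 24-hour processing deadline: requests still unprocessed at that point are reported as expired. Results, once produced, remain available for 29 days after the batch was created. Download and store them before then, or they become permanently unavailable.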
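
To schedule work around this timestamp, it helps to turn it into a number of seconds remaining. The sketch below assumes expires_at is read as an ISO 8601 string with a UTC offset or trailing "Z", as in the raw JSON. The Python SDK may already hand you a datetime object, in which case the parsing step is unnecessary. Both function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp that carries an offset or a trailing 'Z'."""
    # Older Python versions reject "Z" in fromisoformat(), so normalize it first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def seconds_remaining(expires_at: str, now: Optional[datetime] = None) -> float:
    """Seconds between now and expires_at; negative once the deadline has passed."""
    current = now if now is not None else datetime.now(timezone.utc)
    return (parse_timestamp(expires_at) - current).total_seconds()


reference = datetime(2026, 4, 1, tzinfo=timezone.utc)
print(seconds_remaining("2026-04-02T00:00:00Z", now=reference))  # 86400.0
```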

Setup trap

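This guide calls the Batches API through the .beta namespace (client.beta.messages.batches). Recent SDK releases also expose the same methods at client.messages.batches, so pick one namespace and use it consistently. An SDK release that predates Batches support raises AttributeError on the first batches call rather than failing silently, so check your pinned anthropic version if you see that error.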

Cost

Batches cost 50% of standard synchronous API rates for both input and output tokens. Example: 1,000 requests averaging 500 input tokens and 200 output tokens would cost approximately (1000 × 500 × 0.5 × input-token-rate) + (1000 × 200 × 0.5 × output-token-rate). For claude-opus-4-6, as for any other model, that is half the cost of sending the same workload through synchronous API calls.
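
The arithmetic above can be wrapped in a small estimator. This is a sketch under stated assumptions: the per-million-token rates in the example are placeholders, not real prices, and the discount factors default to 0.5 for both input and output, so adjust them if your pricing differs. estimate_batch_cost is a hypothetical helper.

```python
def estimate_batch_cost(
    n_requests: int,
    avg_input_tokens: int,
    avg_output_tokens: int,
    input_rate_per_mtok: float,
    output_rate_per_mtok: float,
    input_factor: float = 0.5,
    output_factor: float = 0.5,
) -> float:
    """Estimate cost in dollars; rates are dollars per million tokens."""
    input_cost = n_requests * avg_input_tokens / 1_000_000 * input_rate_per_mtok * input_factor
    output_cost = n_requests * avg_output_tokens / 1_000_000 * output_rate_per_mtok * output_factor
    return input_cost + output_cost


# Placeholder rates: $10 per million input tokens, $50 per million output tokens.
batch_cost = estimate_batch_cost(1000, 500, 200, 10.0, 50.0)
sync_cost = estimate_batch_cost(1000, 500, 200, 10.0, 50.0, input_factor=1.0, output_factor=1.0)
print(f"batch: ${batch_cost:.2f}, synchronous: ${sync_cost:.2f}")
```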

Rate limits

The Batches API is rate-limited separately from synchronous Messages calls. Limits apply to HTTP requests made to the batch endpoints and to the number of batch requests waiting in the processing queue, and both depend on your usage tier. If you submit batches too frequently, you'll hit these limits. Stagger batch submissions, check your current limits in the Anthropic console, or request higher limits from Anthropic support.

Common gotcha

A common mistake is to poll for batch completion once, see that the batch is still in progress, then move on without checking status again. Batches can take from 5 minutes to several hours, so poll with exponential backoff rather than at a fixed 5-second interval. Also, the results() method returns an iterator of individual result objects, not a single response. You must iterate through it or convert it to a list to access all results, and because the order is not guaranteed, match each result to its input by custom_id.
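
Exponential backoff polling can be sketched as follows. The status check is injected as a callable so the loop can be exercised without network access. get_status is a hypothetical function that would wrap retrieve(batch_id).processing_status, and the 5-second start, factor of 2, and 300-second cap are illustrative defaults.

```python
import time
from itertools import islice
from typing import Callable, Iterator


def backoff_delays(initial: float = 5.0, factor: float = 2.0, cap: float = 300.0) -> Iterator[float]:
    """Yield an endless sequence of delays that grows geometrically up to cap."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def wait_until_ended(
    get_status: Callable[[], str],
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 50,
) -> str:
    """Poll get_status() with backoff until it returns 'ended'."""
    for delay in islice(backoff_delays(), max_polls):
        status = get_status()
        if status == "ended":
            return status
        sleep(delay)
    raise TimeoutError(f"batch did not end within {max_polls} polls")


print(list(islice(backoff_delays(), 8)))  # [5.0, 10.0, 20.0, 40.0, 80.0, 160.0, 300.0, 300.0]
```

Capping the delay keeps worst-case detection latency bounded while still cutting the number of status calls for long-running batches.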

Error recovery

APIConnectionError

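Network timeout or service unavailable. If this happens during create(), you cannot tell whether the batch was accepted, so list your recent batches with client.beta.messages.batches.list() before resubmitting to avoid creating a duplicate. If it happens during retrieve() or results(), the batch itself is unaffected: wait about 30 seconds and retry the call.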

AuthenticationError

Invalid or expired API key. Check that ANTHROPIC_API_KEY is set and not revoked. Regenerate the key in the Anthropic console if needed.

RateLimitError

You've exceeded a Batches API rate limit. Wait before submitting the next batch, and confirm that the previous batch has finished (processing_status is 'ended' and request_counts.processing is 0) before submitting more.

BadRequestError

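The batch payload itself was rejected, for example because the requests list is malformed, a custom_id is missing, or the payload exceeds the size limit. Problems inside an individual request's params, such as an unsupported model string, may instead surface later as an 'errored' entry in the results, so verify the request shape with a single synchronous Messages call before submitting a large batch.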

Experienced dev note

Store batch IDs and submission timestamps in a durable store (database, message broker, or file system) before polling. If your process crashes mid-poll, you can recover the batch status later. Also, batch latency does not scale linearly with size: completion time depends heavily on queue depth, so a 10,000-request batch can finish in a time comparable to a 1,000-request batch, though this is not guaranteed. There is no cost premium for scale beyond token pricing, which makes batches well suited to off-peak processing of large workloads.
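
One way to make batch IDs survive a crash is a small SQLite table, sketched below. The table layout and the function names (open_store, record_batch, mark_status, unfinished_batches) are illustrative assumptions. Any durable store would serve the same purpose.

```python
import sqlite3
import time


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) the tracking database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS batches ("
        "batch_id TEXT PRIMARY KEY, submitted_at REAL NOT NULL, status TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def record_batch(conn: sqlite3.Connection, batch_id: str) -> None:
    """Persist a batch ID right after create() returns, before any polling."""
    conn.execute(
        "INSERT OR IGNORE INTO batches VALUES (?, ?, ?)",
        (batch_id, time.time(), "in_progress"),
    )
    conn.commit()


def mark_status(conn: sqlite3.Connection, batch_id: str, status: str) -> None:
    """Update the last known status of a batch."""
    conn.execute("UPDATE batches SET status = ? WHERE batch_id = ?", (status, batch_id))
    conn.commit()


def unfinished_batches(conn: sqlite3.Connection) -> list[str]:
    """Return IDs that still need polling, for example after a restart."""
    rows = conn.execute("SELECT batch_id FROM batches WHERE status != 'ended'").fetchall()
    return [row[0] for row in rows]


store = open_store(":memory:")  # use a file path in real use so data survives restarts
record_batch(store, "example-batch-id")
print(unfinished_batches(store))
```

On startup, a worker would call unfinished_batches() and resume polling each ID instead of resubmitting the work.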

Check your understanding

You've submitted a batch with 5,000 requests at 10 AM. At 10:30 AM, polling shows processing_status='in_progress' with request_counts of 2,000 succeeded, 2,500 processing, and 500 errored. Should you retrieve results now? What would you do about the 500 failed requests?

Answer hint

Not yet. Results are only available once processing_status == 'ended', and calling results() while the batch is still in progress will fail. After it ends, iterate over the results and collect the custom_ids whose result type is 'errored', read the error detail on each to see why it failed, fix any invalid requests, then decide whether to retry them individually or in a new batch.
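
Separating successes from failures can be sketched over raw JSONL text. This assumes each result line is a JSON object with a custom_id and a result object that has a type field. split_results is a hypothetical helper, and the sample lines are made up for illustration.

```python
import json


def split_results(jsonl_text: str) -> tuple[dict[str, dict], dict[str, str]]:
    """Return (succeeded results by custom_id, failure type by custom_id)."""
    succeeded: dict[str, dict] = {}
    failed: dict[str, str] = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the file
        entry = json.loads(line)
        result = entry["result"]
        if result["type"] == "succeeded":
            succeeded[entry["custom_id"]] = result
        else:
            failed[entry["custom_id"]] = result["type"]
    return succeeded, failed


sample = "\n".join(
    [
        json.dumps({"custom_id": "request-1", "result": {"type": "succeeded"}}),
        json.dumps({"custom_id": "request-2", "result": {"type": "errored"}}),
    ]
)
ok, to_retry = split_results(sample)
print(sorted(ok), to_retry)
```

The keys of to_retry are the custom_ids to look up in your original request list when building a retry batch.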

VERSION Written against anthropic 0.94.x (April 2026), using the client.beta.messages.batches namespace. Beta surfaces can change without a major version bump. Results are delivered as JSONL, one line per request: verify that the request_counts totals match the number of requests you submitted.
