
Polling and retrieving results

What you will learn

Use the Anthropic API to check the status of asynchronous requests and retrieve their results without blocking.

Why this matters

Long-running requests (batch processing, file analysis, or complex reasoning) tie up connections if you wait synchronously. Polling lets you submit work, do other things, and check back later, which is essential for production systems that handle many concurrent tasks.

Skip if: you need results immediately and the request finishes quickly (under about 30 seconds). In that case, use synchronous client.messages.create() instead. Polling adds complexity and is unnecessary for simple, fast requests.

Explanation

The Anthropic API supports asynchronous processing through the Message Batches API. When you submit a batch, for example to process large documents or run many analyses, the API returns immediately with a batch ID rather than waiting for completion. Polling is the pattern of periodically checking the status of that batch until it finishes.

Under the hood, your batch is queued on Anthropic's servers. Each poll is a separate HTTP GET request that checks whether the work is done. The API tracks a batch-level processing_status ('in_progress', 'canceling', or 'ended'), and results become available only once the batch has ended. Unlike webhooks, which push a notification when work completes, polling requires you to manage the loop yourself: you decide the frequency, backoff strategy, and timeout handling.
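The loop described above can be sketched independently of any particular endpoint. In this minimal sketch, poll_until, fetch, and is_done are hypothetical names and not part of the Anthropic SDK. fetch stands in for whatever call returns the current status, and the sleep and clock functions are injectable so the loop can be exercised without real waiting.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    *,
    interval: float = 2.0,
    factor: float = 1.5,
    max_interval: float = 30.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[T]:
    """Call fetch() until is_done(result) is true or the timeout elapses.

    Returns the final result, or None if the deadline passed first.
    """
    deadline = clock() + timeout
    while True:
        result = fetch()  # one status check, e.g. one HTTP GET
        if is_done(result):
            return result
        remaining = deadline - clock()
        if remaining <= 0:
            return None
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
        interval = min(interval * factor, max_interval)
```

With the SDK, a call might look like poll_until(lambda: client.messages.batches.retrieve(batch_id), lambda b: b.processing_status == 'ended'), assuming client and batch_id already exist.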

Use polling when: (1) You're processing files or batches that take minutes to hours, (2) You need to handle multiple requests concurrently without threads, or (3) You want to implement custom retry/timeout logic. Don't poll constantly: use exponential backoff or implement proper wait strategies to avoid rate limits.
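To make the backoff advice concrete, here is a small sketch of a delay schedule. backoff_delays is a hypothetical helper, and the defaults (start at 2 seconds, multiply by 1.5, cap at 30 seconds) are illustrative values rather than anything the API prescribes.

```python
from itertools import islice
from typing import Iterator


def backoff_delays(initial: float = 2.0, factor: float = 1.5, cap: float = 30.0) -> Iterator[float]:
    """Yield an endless sequence of delays that grows geometrically up to cap."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# First eight delays with the defaults, in seconds.
print(list(islice(backoff_delays(), 8)))
# [2.0, 3.0, 4.5, 6.75, 10.125, 15.1875, 22.78125, 30.0]
```

The first seven polls cover just over a minute; after that the loop settles at one poll every 30 seconds.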

Request code

python

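import os
import time

from anthropic import Anthropic

# The SDK also reads ANTHROPIC_API_KEY from the environment by default.
client = Anthropic(api_key=os.environ.get('ANTHROPIC_API_KEY'))

with open('document.txt', 'r') as f:
    document_content = f.read()

# Submit the work as a Message Batch. This call returns as soon as the
# batch is queued, not when the model has finished.
batch = client.messages.batches.create(
    requests=[
        {
            'custom_id': 'document-themes',  # your own label for matching results
            'params': {
                'model': 'claude-opus-4-6',
                'max_tokens': 1024,
                'messages': [
                    {
                        'role': 'user',
                        'content': f'Analyze this document and extract key themes:\n\n{document_content}'
                    }
                ]
            }
        }
    ]
)

batch_id = batch.id
print(f'Batch submitted with ID: {batch_id}')

max_wait = 300      # seconds before giving up
poll_interval = 2   # initial delay between polls, in seconds
start_time = time.time()

while time.time() - start_time < max_wait:
    batch = client.messages.batches.retrieve(batch_id)

    if batch.processing_status == 'ended':
        # 'ended' only means processing stopped; each request has its own outcome.
        for entry in client.messages.batches.results(batch_id):
            if entry.result.type == 'succeeded':
                print(f'Result: {entry.result.message.content[0].text}')
            elif entry.result.type == 'errored':
                print(f'Request {entry.custom_id} failed: {entry.result.error}')
            else:
                print(f'Request {entry.custom_id} ended as {entry.result.type}')
        break

    print(f'Status: {batch.processing_status}, retrying in {poll_interval:.1f}s...')
    time.sleep(poll_interval)
    poll_interval = min(poll_interval * 1.5, 30)
else:
    # The while condition became false without a break: we hit max_wait.
    print(f'Batch still running after {max_wait} seconds')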

Authentication

Set your Anthropic API key before instantiating the client:

export ANTHROPIC_API_KEY='sk-ant-...'

Or pass it directly:

from anthropic import Anthropic
client = Anthropic(api_key='sk-ant-...')

The SDK reads ANTHROPIC_API_KEY from environment variables at instantiation time.

Response shape

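Field	Description
`id`	string: unique identifier for the batch
`processing_status`	string: one of 'in_progress', 'canceling', or 'ended'
`request_counts`	object: number of requests currently processing, succeeded, errored, canceled, or expired
`created_at`	string (ISO 8601): timestamp when the batch was submitted
`expires_at`	string (ISO 8601): timestamp after which unfinished requests expire
`results_url`	string or null: where results can be downloaded; null until the batch has ended

Each entry returned by client.messages.batches.results() has:

Field	Description
`custom_id`	string: the label you assigned to the request at submission
`result.type`	string: one of 'succeeded', 'errored', 'canceled', or 'expired'
`result.message`	object: present only if result.type is 'succeeded'; the message, including its content list
`result.error`	object: present only if result.type is 'errored'; describes what went wrong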

Field guide

processing_status

The first field to check. Poll until it is 'ended'. While the batch is running it is 'in_progress'; 'canceling' means a cancellation is still being applied. Note that 'ended' does not mean success: it only means processing has stopped, so check each entry's result.type afterwards.

result.message

The actual result for one request. Present only when result.type is 'succeeded'. It has the same structure as a synchronous response: check result.message.content[0].text for the assistant's reply.

result.error

Often overlooked: when result.type is 'errored', this object tells you why. Don't just assume failure means 'try again'; some errors (invalid input) won't resolve with retries.
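One way to encode the retry distinction is a small decision function. This is a sketch under assumptions: next_action and its return labels are hypothetical, the outcome names follow the Message Batches result types ('succeeded', 'errored', 'canceled', 'expired'), and treating any error type that starts with 'invalid_request' as non-retryable is an illustrative rule, not SDK behavior.

```python
def next_action(result_type: str, error_type: str = "") -> str:
    """Map one request's outcome to 'done', 'fix_input', 'skip', or 'resubmit'."""
    if result_type == "succeeded":
        return "done"
    if result_type == "errored" and error_type.startswith("invalid_request"):
        # Resending the same payload would fail the same way.
        return "fix_input"
    if result_type == "canceled":
        # The caller canceled the batch; don't resend automatically.
        return "skip"
    # Other errors and expired requests can be sent again.
    return "resubmit"
```

Keeping this decision in one place makes it harder for a retry loop to keep resubmitting input that can never succeed.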

Setup trap

There is no client.messages.retrieve() method. client.messages.create() is synchronous: it blocks until the reply is ready, and the message ID it returns cannot be polled. Only work submitted with client.messages.batches.create() returns a batch ID that you can pass to client.messages.batches.retrieve(). Make sure you submit through the batch endpoint if you intend to poll: not all request types support it.

Cost

Status checks are not billed in tokens: you pay for the tokens your batched requests consume, not for polling. Each poll is still a separate API request that counts toward your request rate limits, though. If you poll every second for 5 minutes, that's 300 requests, nearly all of them wasted. Use exponential backoff and set reasonable timeouts to minimize unnecessary calls.
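The arithmetic is easy to check with a short script. count_polls is a hypothetical helper that counts one status check after each delay, and the backoff parameters (2 seconds, factor 1.5, 30-second cap) are illustrative values.

```python
def count_polls(total_seconds: float, initial: float, factor: float = 1.0,
                cap: float = float("inf")) -> int:
    """Count status checks made within total_seconds, one after each delay."""
    polls, elapsed, delay = 0, 0.0, initial
    while elapsed + delay <= total_seconds:
        elapsed += delay
        polls += 1
        delay = min(delay * factor, cap)
    return polls


fixed = count_polls(300, initial=1)                         # one poll per second
backoff = count_polls(300, initial=2, factor=1.5, cap=30)   # 2s growing to 30s
print(fixed, backoff)  # 300 14
```

Over the same five minutes, backoff cuts the number of requests from 300 to 14.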

Rate limits

Rapid polling can trigger rate limits (429 Too Many Requests); the exact threshold depends on the limits that apply to your account. If you're polling many requests concurrently, stagger the polls and use exponential backoff. Add jitter (a random delay) to avoid thundering-herd problems when multiple workers poll simultaneously.
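Jitter can be as simple as scaling each delay by a random factor. In this sketch, jittered_delay is a hypothetical helper and the 50% spread is an arbitrary illustrative choice.

```python
import random


def jittered_delay(base: float, rng: random.Random, spread: float = 0.5) -> float:
    """Return base scaled by a random factor in [1 - spread, 1 + spread]."""
    return base * rng.uniform(1.0 - spread, 1.0 + spread)


# Three workers that would otherwise all poll again after exactly 10 seconds.
rng = random.Random()
delays = [jittered_delay(10.0, rng) for _ in range(3)]
print([round(d, 2) for d in delays])
```

Each worker now waits somewhere between 5 and 15 seconds, so their next polls are unlikely to land in the same instant.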

Common gotcha

Polling too frequently will hit rate limits. Many developers write a tight loop while True: check_status() without backoff, then wonder why they get 429 errors. Always implement exponential backoff starting at 1–2 seconds and capping at 30 seconds between polls.

Error recovery

RateLimitError (429)

Rate limit hit. Increase poll_interval, implement exponential backoff, and add random jitter. Example: poll_interval = min(poll_interval * (1.5 + random.random()), 60)

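NotFoundError (404)

Batch ID not found. Check for a typo, or for an ID that was created under a different workspace or API key. Separately, a batch that has not finished within 24 hours expires, and its results can only be downloaded for a limited time after creation, so don't store batch IDs indefinitely. Resubmit the work if the results are gone.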

APIConnectionError

Network timeout or unreachable endpoint. This is transient: add try/except around retrieve() calls and retry with backoff.
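A retry wrapper for transient failures might look like the sketch below. call_with_retry is a hypothetical helper, not an SDK function; the exception types to retry are passed in so that permanent errors still surface immediately.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    call: Callable[[], T],
    transient: Tuple[Type[BaseException], ...],
    *,
    attempts: int = 5,
    delay: float = 1.0,
    factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying only on the exception types listed in transient."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except transient:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= factor
    raise ValueError("attempts must be at least 1")
```

With the SDK, transient could be (anthropic.APIConnectionError, anthropic.RateLimitError), and call a lambda wrapping the retrieve() call.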

BadRequestError (400, invalid_request_error)

The request was malformed, for example because the ID you passed is not a valid batch ID. Verify that the ID is the one returned when the batch was created and that you're polling with the same client/account that submitted the original request.

Experienced dev note

Don't run polling loops inside web request handlers in high-scale systems: they block threads and waste resources. Use a background job queue (Celery, AWS SQS, Google Cloud Tasks) with a worker that polls and stores results in a database. Also, always set a timeout and log request IDs; you'll need them to debug stuck jobs in production.
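The worker pattern can be sketched with nothing more than SQLite. Everything here is hypothetical: the jobs table, init_db, poll_pending, and the check callback, which stands in for a real status lookup and returns the result text once a job is finished.

```python
import sqlite3
from typing import Callable, Optional


def init_db(conn: sqlite3.Connection) -> None:
    """Create the jobs table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "job_id TEXT PRIMARY KEY, status TEXT NOT NULL, result TEXT)"
    )


def poll_pending(conn: sqlite3.Connection, check: Callable[[str], Optional[str]]) -> int:
    """Check every pending job once and store results for finished ones.

    check(job_id) returns the result text when the job is done, else None.
    Returns how many jobs were completed on this pass.
    """
    pending = conn.execute("SELECT job_id FROM jobs WHERE status = 'pending'").fetchall()
    finished = 0
    for (job_id,) in pending:
        result = check(job_id)
        if result is not None:
            conn.execute(
                "UPDATE jobs SET status = 'done', result = ? WHERE job_id = ?",
                (result, job_id),
            )
            finished += 1
    conn.commit()
    return finished
```

A scheduler or queue worker would call poll_pending on an interval, while web handlers only read the jobs table and never wait on the API themselves.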

Check your understanding

You submit a batch at 2:00 PM and get back batch_id='abc123'. Your polling loop checks status every 2 seconds and sees 'in_progress' at 2:00:30 PM, then again at 2:00:32 PM. At 2:05 PM, status is still 'in_progress'. Should you keep polling indefinitely, and what's the actual risk if you don't set a timeout?

Answer hint

Set a hard timeout instead of polling indefinitely. The real risk isn't that the request hangs forever; it's that something failed silently without ever reporting a failure, and you're wasting quota and compute polling a ghost. In production, log the ID and alert when the status has not changed for longer than your expected completion time.

VERSION In anthropic 0.94.x (April 2026), asynchronous request patterns are available via client.messages.batches.create(), client.messages.batches.retrieve(), and client.messages.batches.results(). Earlier SDK versions exposed the same endpoints under client.beta.messages.batches. client.messages.create() itself remains synchronous and has no retrieve() counterpart.
