API Advanced hard · 8 min

Using Claude via AWS Bedrock

What you will learn

Access Claude models through AWS Bedrock's managed inference service instead of Anthropic's direct API, enabling native AWS IAM authentication and VPC isolation.

Why this matters

AWS Bedrock lets you call Claude from inside your AWS environment with IAM-based auth, VPC endpoints, and unified billing. That matters for enterprises with security compliance requirements, private-network deployments, or existing AWS commitments.

Skip if: you need new Claude versions the moment they ship (Bedrock availability can trail Anthropic releases by about two weeks), you operate outside AWS, or you have no AWS infrastructure or compliance requirements. In those cases, use Anthropic's direct API.

Explanation

What it does: AWS Bedrock is a managed service that hosts Claude and other foundation models. Instead of calling api.anthropic.com directly, you call the Bedrock API (bedrock-runtime.us-east-1.amazonaws.com) with your AWS credentials, and Bedrock routes your request to Claude's inference engine.

How it works: You build a JSON body in Anthropic's Messages format (anthropic_version, max_tokens, messages) and send it with Bedrock's InvokeModel operation, which boto3 signs with your AWS credentials (SigV4). Bedrock checks that your IAM identity is allowed bedrock:InvokeModel on the model, runs the request on AWS-hosted Claude, and returns a JSON body with the same fields as an Anthropic Messages response (content, stop_reason, usage). The request and response shapes carry over from the direct API; the endpoint, authentication, and model IDs differ.

When to use it: Use Bedrock when (1) your organization requires all ML inference inside AWS for data residency, (2) you need IAM-based access control instead of API keys, (3) you want VPC endpoints to keep inference traffic off the public internet, or (4) you're standardizing on Bedrock for multiple foundation models and want unified billing.

Request code

python

import json
import boto3

# Bedrock runtime client; credentials come from the standard AWS credential chain
# (environment variables, ~/.aws/credentials, or an instance role).
bedrock_client = boto3.client('bedrock-runtime', region_name='us-east-1')

def invoke_claude_via_bedrock(prompt: str) -> str:
    payload = {
        'anthropic_version': 'bedrock-2023-05-31',
        'max_tokens': 1024,
        'messages': [
            {
                'role': 'user',
                'content': prompt
            }
        ]
    }
    
    response = bedrock_client.invoke_model(
        modelId='anthropic.claude-opus-4-6-20250514-v1:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps(payload)
    )
    
    result = json.loads(response['body'].read().decode('utf-8'))
    return result['content'][0]['text']

answer = invoke_claude_via_bedrock('Explain why serverless architectures require careful timeout tuning.')
print(answer)

Authentication

1. Create an AWS IAM user or role with the bedrock:InvokeModel permission. Attach the managed policy arn:aws:iam::aws:policy/AmazonBedrockFullAccess, or preferably a custom policy scoped to specific model ARNs.
2. Configure AWS credentials via ~/.aws/credentials, environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY), or an EC2 instance role.
3. In your Python environment, ensure boto3 is installed and credentials are loaded before creating the bedrock-runtime client.
4. Verify that Bedrock is enabled in your AWS region (us-east-1, us-west-2, eu-west-1 as of April 2026).
5. Test access: aws bedrock list-foundation-models --region us-east-1 should list Claude models without error.
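Step 1 mentions scoping the policy to specific model ARNs. A minimal sketch of such a policy document, built as a Python dict so it can be serialized for the IAM console or CLI, follows. The helper name build_invoke_policy and the Sid value are hypothetical; the model ID is the one used elsewhere in this lesson and should be confirmed for your region.

```python
import json


def build_invoke_policy(region: str, model_id: str) -> dict:
    """Return an IAM policy document allowing invocation of one foundation model."""
    # Foundation-model ARNs have an empty account field (note the double colon).
    model_arn = f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeSingleClaudeModel",  # hypothetical label
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": model_arn,
            }
        ],
    }


policy = build_invoke_policy("us-east-1", "anthropic.claude-opus-4-6-20250514-v1:0")
print(json.dumps(policy, indent=2))
```

Scoping Resource to one model ARN keeps the identity from invoking other models, which AmazonBedrockFullAccess would allow.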

Response shape

Field	Description
`content`	List of content blocks; for a plain text reply, the first item contains the model's text response
`content[0].type`	"text" for ordinary text replies; other block types (such as "tool_use") appear when tools are in use
`content[0].text`	The actual model output string
`stop_reason`	"end_turn", "max_tokens", "stop_sequence", or "tool_use": why generation stopped
`usage.input_tokens`	Number of tokens in your prompt
`usage.output_tokens`	Number of tokens in Claude's response
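The fields in the table can be pulled out of the decoded response with a small helper. This is a sketch only: ParsedResponse and parse_bedrock_response are hypothetical names, and the sample dict is hand-written to match the table rather than captured from a real call.

```python
from dataclasses import dataclass


@dataclass
class ParsedResponse:
    text: str
    stop_reason: str
    input_tokens: int
    output_tokens: int

    @property
    def truncated(self) -> bool:
        # "max_tokens" means generation stopped at the token limit.
        return self.stop_reason == "max_tokens"


def parse_bedrock_response(result: dict) -> ParsedResponse:
    """Extract text, stop_reason, and token usage from a decoded response body."""
    # Join every text block instead of assuming content[0] is the only one.
    text = "".join(
        block.get("text", "")
        for block in result.get("content", [])
        if block.get("type") == "text"
    )
    usage = result.get("usage", {})
    return ParsedResponse(
        text=text,
        stop_reason=result.get("stop_reason", ""),
        input_tokens=usage.get("input_tokens", 0),
        output_tokens=usage.get("output_tokens", 0),
    )


# Hand-written sample shaped like the table above.
sample = {
    "content": [{"type": "text", "text": "Timeouts bound cost and latency."}],
    "stop_reason": "max_tokens",
    "usage": {"input_tokens": 14, "output_tokens": 8},
}
parsed = parse_bedrock_response(sample)
print(parsed.text, parsed.truncated, parsed.input_tokens, parsed.output_tokens)
```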

Field guide

stop_reason

If stop_reason is "max_tokens", the response was cut off at the token limit: increase max_tokens and resend if you need the full answer.

usage

Bedrock charges per token (input + output). Always log usage.input_tokens and usage.output_tokens to track spend; Bedrock usage is billed through your AWS account at AWS's published rates, which may differ from direct Anthropic pricing.

Setup trap

AWS credential resolution is silent: if AWS_PROFILE points to a role without the bedrock:InvokeModel permission, the boto3 client initializes successfully and only fails at invoke time with an AccessDeniedException. Always run aws sts get-caller-identity and aws bedrock list-foundation-models before running your code. Also, Bedrock model IDs differ from Anthropic's (e.g., 'anthropic.claude-opus-4-6-20250514-v1:0' vs. 'claude-opus-4-6'); check the AWS console or call list_foundation_models() on a boto3 'bedrock' client (not 'bedrock-runtime') to confirm the exact ID for your region.
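The same two checks can be run from Python before the first inference call. The sketch below takes the clients as arguments so it can be exercised with stand-ins; preflight is a hypothetical helper name, while get_caller_identity() and list_foundation_models() are the boto3 calls behind the two CLI commands.

```python
def preflight(sts_client, bedrock_client, wanted_model_id: str) -> dict:
    """Report which identity is active and whether the wanted model ID is listed."""
    identity = sts_client.get_caller_identity()
    listing = bedrock_client.list_foundation_models()
    # Keep only Anthropic models; their Bedrock IDs start with "anthropic.".
    anthropic_ids = [
        summary["modelId"]
        for summary in listing.get("modelSummaries", [])
        if summary["modelId"].startswith("anthropic.")
    ]
    return {
        "caller_arn": identity["Arn"],
        "model_available": wanted_model_id in anthropic_ids,
        "anthropic_model_ids": anthropic_ids,
    }


# Real usage (requires boto3 and AWS credentials):
#   import boto3
#   report = preflight(
#       boto3.client("sts"),
#       boto3.client("bedrock", region_name="us-east-1"),  # control plane, not bedrock-runtime
#       "anthropic.claude-opus-4-6-20250514-v1:0",
#   )
```

Listing models uses a different IAM action from invoking them, so a clean report narrows the problem but does not prove that bedrock:InvokeModel is allowed.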

Cost

This lesson's figures for Claude Opus 4.6 on Bedrock are $0.003 per 1K input tokens and $0.015 per 1K output tokens (as of April 2026). Confirm current rates on the AWS Bedrock pricing page and compare them with Anthropic's price list before assuming either is cheaper. On-demand usage is billed per token; Provisioned Throughput is billed hourly for reserved capacity, so it is worth evaluating only at sustained high volume (this lesson's rule of thumb is beyond ~500K tokens/day). Standard AWS data transfer charges also apply to traffic leaving AWS, so running inference and post-processing within EC2 or Lambda avoids egress costs.
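Per-request cost follows directly from the usage fields. The sketch below uses the per-1K rates quoted in this section as default arguments; treat them as placeholders to replace with the current price list, and estimate_cost_usd as a hypothetical helper name.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_1k: float = 0.003,   # placeholder rate quoted in this section
    output_rate_per_1k: float = 0.015,  # placeholder rate quoted in this section
) -> float:
    """Estimate on-demand cost in USD from token counts and per-1K-token rates."""
    input_cost = (input_tokens / 1000) * input_rate_per_1k
    output_cost = (output_tokens / 1000) * output_rate_per_1k
    return input_cost + output_cost


# Example: a request with 1,200 input tokens and 400 output tokens.
cost = estimate_cost_usd(1200, 400)
print(f"${cost:.4f}")
```

With those placeholder rates the example works out to 1.2 × 0.003 + 0.4 × 0.015 = 0.0036 + 0.0060 = 0.0096 USD.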

Rate limits

Bedrock enforces per-model quotas on requests and tokens for on-demand use. The exact values depend on model, region, and account, so check the Service Quotas console rather than assuming a default. Production workloads should request increases there or buy Provisioned Throughput. When a request is throttled, Bedrock returns ThrottlingException. boto3 retries throttling errors a few times by default (tunable through botocore's Config retries setting, e.g. max_attempts and mode), but sustained throttling still surfaces to your code, so add your own exponential backoff with jitter around invoke_model.
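A minimal sketch of exponential backoff with jitter follows. It reads the error code from the response attribute that botocore's ClientError carries, so it needs no AWS imports. call_with_backoff and error_code are hypothetical helper names, and the demo uses a simulated throttling error instead of a real Bedrock call.

```python
import random
import time


def error_code(exc: Exception) -> str:
    """Return the AWS error code from a botocore ClientError-shaped exception."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code", "")


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn(), retrying only on ThrottlingException with capped, jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            out_of_attempts = attempt == max_attempts - 1
            if error_code(exc) != "ThrottlingException" or out_of_attempts:
                raise
            # Full jitter: random wait between 0 and the capped exponential delay.
            sleep(random.uniform(0, min(max_delay, base_delay * 2 ** attempt)))


# Simulated demo (no AWS calls): fail twice with a throttling error, then succeed.
class FakeThrottle(Exception):
    response = {"Error": {"Code": "ThrottlingException"}}


calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise FakeThrottle()
    return "ok"


waits = []
result = call_with_backoff(flaky, sleep=waits.append)
print(result, len(waits))
```

In real code, fn would be a zero-argument callable wrapping bedrock_client.invoke_model(...). Injecting sleep keeps the helper testable without real delays.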

Common gotcha

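The standard Anthropic client cannot talk to Bedrock: passing AWS credentials or a dummy api_key to Anthropic() and calling client.messages.create() still sends the request to api.anthropic.com. To reach Bedrock you either call InvokeModel through boto3 with a raw JSON payload, as this lesson does, or use the SDK's separate AnthropicBedrock client (installed with the anthropic[bedrock] extra), which signs requests with your AWS credentials. Many developers assume 'use the SDK with Bedrock credentials' will work with the default client; it won't.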

Error recovery

AccessDenied

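Your AWS identity lacks the bedrock:InvokeModel permission (boto3 reports this as AccessDeniedException). Run aws sts get-caller-identity to see which identity is in use, then inspect its policies (aws iam list-attached-role-policies or list-attached-user-policies for managed policies, aws iam list-role-policies or list-user-policies for inline ones) and verify one of them allows bedrock:InvokeModel on the model's ARN. If using a cross-account role, ensure the trust relationship allows your principal.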

ResourceNotFoundException

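The model ID doesn't exist in your region. Call list_foundation_models() on a boto3 'bedrock' client (the 'bedrock-runtime' client used for inference does not expose it) to see available model IDs (e.g., anthropic.claude-3-5-sonnet-20241022-v2:0). Model availability varies by region, and each new model version gets a new ID, so re-check after upgrades.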

ValidationException (invalid payload)

Your JSON payload doesn't match Bedrock's schema for Anthropic models. Ensure 'anthropic_version': 'bedrock-2023-05-31' is present, 'max_tokens' is an integer ≥ 1, and 'messages' is a non-empty list whose items each have 'role' and 'content' keys.
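These checks are cheap to run locally before sending a request. The sketch below covers only the three conditions named above and is not Bedrock's full schema; validate_payload is a hypothetical helper name.

```python
def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems found in a Bedrock Anthropic request body."""
    problems = []

    version = payload.get("anthropic_version")
    if not isinstance(version, str) or not version:
        problems.append("anthropic_version is missing")

    max_tokens = payload.get("max_tokens")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens < 1:
        problems.append("max_tokens must be an integer >= 1")

    messages = payload.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("messages must be a non-empty list")
    else:
        for index, message in enumerate(messages):
            if not isinstance(message, dict) or "role" not in message or "content" not in message:
                problems.append(f"messages[{index}] needs 'role' and 'content' keys")

    return problems


# A deliberately broken payload: no version, zero max_tokens, empty messages.
issues = validate_payload({"max_tokens": 0, "messages": []})
print(issues)
```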

ThrottlingException

You've exceeded your on-demand quota or provisioned throughput. Implement exponential backoff with jitter: wait a random interval of up to 2^attempt seconds before retrying, with a cap on the maximum wait. If throttling persists, increase provisioned throughput via the AWS console or request an on-demand quota increase.

ServiceUnavailableException

Bedrock is temporarily unavailable in your region. Retry with exponential backoff or switch to a different region if your use case permits.
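The five cases above split into errors worth retrying and errors that need a code or configuration fix. A small lookup table makes that split explicit. The action labels and the recovery_action name are hypothetical; with boto3, the error code is found at exc.response["Error"]["Code"] on a botocore ClientError.

```python
# Map AWS error codes from this section to a recovery action label.
RECOVERY_ACTIONS = {
    "AccessDenied": "fix_permissions",
    "AccessDeniedException": "fix_permissions",
    "ResourceNotFoundException": "check_model_id",
    "ValidationException": "fix_payload",
    "ThrottlingException": "retry_with_backoff",
    "ServiceUnavailableException": "retry_with_backoff",
}


def recovery_action(code: str) -> str:
    """Return the suggested action for an AWS error code, or 'raise' if unknown."""
    return RECOVERY_ACTIONS.get(code, "raise")


def is_retryable(code: str) -> bool:
    """Only transient errors are worth retrying; the rest need a fix first."""
    return recovery_action(code) == "retry_with_backoff"


for code in ("ThrottlingException", "ValidationException", "SomethingElse"):
    print(code, recovery_action(code), is_retryable(code))
```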

Experienced dev note

Bedrock's response body follows the Anthropic Messages format, but the request path differs: requests are SigV4-signed and sent to the bedrock-runtime endpoint, either through boto3's invoke_model() or through the SDK's AnthropicBedrock client. This matters because (1) model IDs, credentials, and error types differ between the two backends, so swapping between Anthropic and Bedrock takes more than a config flag, and (2) with raw boto3 you're responsible for payload construction, retry tuning, and token tracking. Some teams avoid Bedrock for this reason, opting for direct Anthropic + VPC egress filtering instead. However, if your org mandates 'no external API calls,' Bedrock becomes mandatory: plan for the extra abstraction layer and test failover to a second region before production.
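Region failover can be sketched as trying a list of per-region callables in order. Everything here is illustrative: invoke_with_failover is a hypothetical helper, the demo callables stand in for functions that would wrap a boto3 bedrock-runtime client per region, and the default policy of failing over on any exception should be narrowed in production to availability errors.

```python
def invoke_with_failover(invokers, prompt, should_failover=lambda exc: True):
    """Try each (region, callable) pair in order; return (region, answer) on first success."""
    failures = []
    for region, invoke in invokers:
        try:
            return region, invoke(prompt)
        except Exception as exc:
            # Errors such as a malformed payload will fail everywhere; let callers opt out.
            if not should_failover(exc):
                raise
            failures.append(f"{region}: {exc!r}")
    raise RuntimeError("all regions failed: " + "; ".join(failures))


# Simulated demo (no AWS calls): the primary region is down, the secondary answers.
def primary(prompt):
    raise ConnectionError("simulated regional outage")


def secondary(prompt):
    return f"echo: {prompt}"


region, answer = invoke_with_failover(
    [("us-east-1", primary), ("us-west-2", secondary)],
    "ping",
)
print(region, answer)
```

Because model availability varies by region, each callable should carry the model ID that is valid for its own region.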

Check your understanding

Why can't you simply pass your AWS credentials to the Anthropic SDK's Anthropic() constructor and have it work with Bedrock? What must change in your code architecture?

Answer hint

The standard Anthropic client targets Anthropic's API servers and authenticates with an API key. Bedrock is a different API (AWS's InvokeModel operation) with SigV4-signed requests, different endpoints, and different model IDs. The default client doesn't auto-detect or route to Bedrock: you call it through boto3, as in this lesson, or through the SDK's dedicated AnthropicBedrock client.

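VERSION The standard Anthropic client in the anthropic SDK does not target Bedrock; the SDK's Bedrock support lives in the separate AnthropicBedrock client, installed with the anthropic[bedrock] extra. To invoke Bedrock directly as shown here, use boto3 >= 1.28.57 (the first release with the bedrock-runtime service). Claude model IDs on Bedrock (e.g., 'anthropic.claude-opus-4-6-20250514-v1:0') can lag Anthropic's public releases by ~2 weeks. Check the AWS Bedrock console for the latest available model ID in your region before deploying.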
