Using Claude via AWS Bedrock
Why this matters
AWS Bedrock lets you run Claude entirely within AWS infrastructure with IAM-based auth, VPC endpoints, and unified billing: critical for enterprises with security compliance requirements, air-gapped deployments, or existing AWS commitments.
Explanation
What it does: AWS Bedrock is a managed service that hosts Claude and other foundation models. Instead of calling api.anthropic.com directly, you call the Bedrock API (bedrock-runtime.us-east-1.amazonaws.com) with your AWS credentials, and Bedrock routes your request to Claude's inference engine.
How it works: When you instantiate the Anthropic SDK with api_key pointing to a Bedrock session, the SDK translates your messages.create() calls into Bedrock's InvokeModel API. Bedrock validates your IAM role permissions, then invokes Claude in an isolated VPC environment. Response format mirrors Anthropic's API exactly: the SDK handles translation transparently. This means you write identical code whether you target Anthropic or Bedrock; only the credentials differ.
When to use it: Use Bedrock when (1) your organization requires all ML inference inside AWS for data residency, (2) you need IAM-based access control instead of API keys, (3) you want VPC endpoints to keep inference traffic off the public internet, or (4) you're standardizing on Bedrock for multiple foundation models and want unified billing.
Request code
import json
import boto3
from anthropic import Anthropic
bedrock_client = boto3.client('bedrock-runtime', region_name='us-east-1')
client = Anthropic(
api_key='unused-with-bedrock',
http_client=None
)
def invoke_claude_via_bedrock(prompt: str) -> str:
payload = {
'anthropic_version': 'bedrock-2023-06-01',
'max_tokens': 1024,
'messages': [
{
'role': 'user',
'content': prompt
}
]
}
response = bedrock_client.invoke_model(
modelId='anthropic.claude-opus-4-6-20250514-v1:0',
contentType='application/json',
accept='application/json',
body=json.dumps(payload)
)
result = json.loads(response['body'].read().decode('utf-8'))
return result['content'][0]['text']
answer = invoke_claude_via_bedrock('Explain why serverless architectures require careful timeout tuning.')
print(answer) Authentication
1. Create an AWS IAM user or role with bedrock:InvokeModel permissions: attach policy arn:aws:iam::aws:policy/AmazonBedrockFullAccess (or scope to specific model ARNs). 2. Configure AWS credentials via ~/.aws/credentials, environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY), or EC2 instance role. 3. In your Python environment, ensure boto3 is installed and credentials are loaded before instantiating the Anthropic client. 4. Verify Bedrock is enabled in your AWS region (us-east-1, us-west-2, eu-west-1 as of April 2026). 5. Test access: aws bedrock list-foundation-models --region us-east-1 should list Claude models without error.
Response shape
| Field | Description |
|---|---|
content | List of content blocks, first item contains the model's text response |
content[0].type | "text": always text for message responses |
content[0].text | The actual model output string |
stop_reason | "end_turn" | "max_tokens": why generation stopped |
usage.input_tokens | Number of tokens in your prompt |
usage.output_tokens | Number of tokens in Claude's response |
Field guide
stop_reason If stop_reason is "max_tokens", your response was truncated: increase max_tokens if you need the full answer
usage Bedrock charges per token (input + output). Always log usage.input_tokens and usage.output_tokens to track spend; Bedrock bills differ from direct Anthropic pricing
Setup trap
AWS credential loading is silent: if you have AWS_PROFILE set to a role without bedrock:InvokeModel permissions, the boto3 client will initialize successfully but fail at invoke time with an opaque AccessDenied error. Always test with aws sts get-caller-identity and aws bedrock list-foundation-models before running your code. Also, Bedrock model IDs differ from Anthropic's (e.g., 'anthropic.claude-opus-4-6-20250514-v1:0' vs. 'claude-opus-4-6'); check the AWS console or boto3 list_foundation_models() to confirm the exact ID for your region.
Cost
Bedrock charges $0.003 per 1K input tokens and $0.015 per 1K output tokens for Claude Opus 4.6 (as of April 2026): about 20% cheaper than direct Anthropic API for high-volume inference. However, on-demand Bedrock incurs a minimum charge; consider provisioned throughput if you exceed ~500K tokens/day. Also, data transferred out of AWS to your application counts toward AWS data transfer charges (~$0.02/GB), so running inference and post-processing within EC2 or Lambda avoids egress costs.
Rate limits
Bedrock enforces per-model rate limits based on your account's provisioned throughput. Default on-demand limits are generous (~100 requests/second), but production workloads should request higher limits via AWS Service Quotas console. If you hit rate limits, boto3 does not retry automatically; implement exponential backoff with jitter (boto3's config Retries.max_attempts only applies to transient errors, not rate limits).
Common gotcha
You must construct and invoke Bedrock's InvokeModel directly with raw JSON payload: you cannot use the Anthropic SDK's client.messages.create() method with Bedrock. The SDK's standard methods only work with Anthropic's direct API. Many developers assume 'use the SDK with Bedrock credentials' will work; it won't. You need boto3 and manual JSON marshalling.
Error recovery
AccessDeniedResourceNotFoundExceptionValidationException (invalid payload)ThrottlingExceptionServiceUnavailableExceptionExperienced dev note
Bedrock's response format is identical to Anthropic's API, but the marshalling path is different: use boto3's invoke_model(), not the Anthropic SDK's client methods. This matters because it means (1) you can't easily swap between Anthropic and Bedrock in the same code path using just a config flag, and (2) you're responsible for retry logic and token tracking. Many teams avoid Bedrock for this reason, opting for direct Anthropic + VPC egress filtering instead. However, if your org mandates 'no external API calls,' Bedrock becomes mandatory: plan for the extra abstraction layer and test failover to a second region before production.
Check your understanding
Why can't you simply pass your AWS credentials to the Anthropic SDK's client.Anthropic() constructor and have it work with Bedrock? What must change in your code architecture?
Show answer hint
The Anthropic SDK (even with AWS credentials) targets Anthropic's API servers directly. Bedrock is a different API entirely (AWS's InvokeModel operation), so you must use boto3 to call it. The SDK doesn't auto-detect or route to Bedrock: you handle the HTTP layer yourself with boto3.