API Beginner easy · 4 min

claude-3-5-sonnet: best for most tasks

What you will learn

Claude 3.5 Sonnet is the default production model that balances speed, cost, and quality for most AI applications.

Why this matters

Choosing the right model directly affects your application's latency, cost-per-request, and output quality. Sonnet is the sweet spot for developers who don't need the full capability of Opus but want better reasoning than Haiku.

Skip if: Use Claude 3.5 Opus for tasks requiring maximum reasoning capability (complex analysis, code generation, strategic thinking) or Claude 3.5 Haiku for latency-critical, cost-sensitive tasks (chat, summarization, simple classification) where speed matters more than nuance.

Explanation

Claude 3.5 Sonnet is Anthropic's mid-tier model released in June 2024, optimized as the default choice for production systems. It processes text faster than Opus while maintaining strong reasoning capabilities at a lower cost per token. Under the hood, Sonnet uses the same transformer architecture as other Claude models but with training optimizations that reduce latency by ~25% compared to Opus while maintaining 95% of reasoning quality for most tasks. The model excels at code generation, content analysis, customer support automation, and general-purpose AI tasks. Use Sonnet when you need predictable performance without overthinking model selection: it's the model Anthropic optimizes for first when adding new features.

Request code

python

import os
from anthropic import Anthropic

client = Anthropic(api_key=os.environ.get('ANTHROPIC_API_KEY'))

message = client.messages.create(
    model='claude-3-5-sonnet-20241022',
    max_tokens=1024,
    messages=[
        {
            'role': 'user',
            'content': 'Explain how quantum entanglement works in three sentences.'
        }
    ]
)

print(f'Model: {message.model}')
print(f'Stop reason: {message.stop_reason}')
print(f'Output: {message.content[0].text}')
print(f'Input tokens: {message.usage.input_tokens}')
print(f'Output tokens: {message.usage.output_tokens}')

Authentication

Set your API key as an environment variable before running code: export ANTHROPIC_API_KEY='sk-ant-...' (macOS/Linux) or set ANTHROPIC_API_KEY=sk-ant-... (Windows). The Anthropic SDK reads this automatically when you instantiate the client. No explicit authentication calls are needed.

Response shape

Field	Description
`id`	msg_abc123: unique message identifier
`type`	message: always 'message' for this endpoint
`role`	assistant: indicates the response is from Claude
`content`	array of content blocks, typically [{'type': 'text', 'text': 'response...'}]
`model`	claude-3-5-sonnet-20241022: exact model used (may differ from requested if deprecated)
`stop_reason`	end_turn or max_tokens: why generation stopped
`stop_sequence`	null or the sequence that triggered stop (if set in request)
`usage.input_tokens`	number of tokens in your messages
`usage.output_tokens`	number of tokens in Claude's response

Field guide

stop_reason

Always check this: 'max_tokens' means Claude was cut off mid-thought; 'end_turn' means a natural stop. Never trust incomplete responses.

model

The actual model served may differ from your request if yours is deprecated. Always log this to catch model version drift in production.

content[0].text

The actual text response. Use content[0] because content is always an array (important for vision or multi-modal future requests).

usage.output_tokens

Critical for cost tracking. Charge customers based on output tokens, not input: output tokens often exceed input tokens in long conversations.

Setup trap

The Anthropic SDK reads ANTHROPIC_API_KEY from os.environ at client instantiation time. If you set os.environ['ANTHROPIC_API_KEY'] after creating the client, it won't be picked up: reorder your code to set the env var before Anthropic() is called.

Cost

Sonnet costs $3 per 1M input tokens and $15 per 1M output tokens (April 2026 pricing). A typical 1000-token request costs ~$0.018. Budget 10-15x more for output tokens than input in your cost models because Claude often generates longer responses than the input prompt.

Rate limits

Standard tier allows 40,000 RPM (requests per minute) and 2M TPM (tokens per minute). Most developers hit TPM limits before RPM. If rate-limited, implement exponential backoff with jitter: wait 1s, 2s, 4s before retrying.

Common gotcha

Passing model='claude-3-5-sonnet' without the exact version suffix (like -20241022) will route to the latest version, which may change Anthropic's behavior. Always pin the full model ID in production code to prevent silent breaking changes.

Error recovery

AuthenticationError

Your ANTHROPIC_API_KEY is missing, empty, or malformed. Verify: (1) export was successful (echo $ANTHROPIC_API_KEY), (2) key starts with 'sk-ant-', (3) no extra spaces in the env var.

RateLimitError

You've exceeded rate limits. Implement exponential backoff with jitter. Wait before retry: random(1-3) seconds first attempt, then double each time up to 60 seconds.

InvalidRequestError with 'max_tokens'

max_tokens value is too high (max 4096 for Sonnet). Or model name is mistyped. Double-check the exact model ID 'claude-3-5-sonnet-20241022'.

APIConnectionError

Network issue or Anthropic API temporarily down. Retry with backoff. Check status.anthropic.com.

APIStatusError with 400

Message format error. Verify: messages array has role + content, role is 'user' or 'assistant' (alternating), content is a string or array of dicts with 'type' and 'text'.

Experienced dev note

Sonnet is the model Anthropic optimizes for in production. Feature rollouts land here first, and Anthropic's own reliability metrics are highest for Sonnet. This is not a second-choice model: it's the strategic choice. Also: log model version in all production responses. Silent model rollouts have caught teams off guard; detecting them requires comparing response.model against your request.

Check your understanding

Why would you get different outputs from two identical requests to Sonnet on the same day, and what should you check first?

Show answer hint

Anthropic silently rolls out newer model versions when old versions reach deprecation. Check response.model against your requested model: if they differ, you've been routed to a newer version. This is also why pinning the full model ID matters.

VERSION claude-3-5-sonnet-20241022 is the current stable version (April 2026). Earlier versions like claude-3-5-sonnet-20240620 are deprecated but still available: however, Anthropic may discontinue them. Always use the latest -YYYYMMDD suffix to stay on supported versions. Check https://docs.anthropic.com/en/docs/about/models/latest for current model IDs.

Community Notes

No notes yetBe the first to share a version-specific fix or tip.