API Beginner easy · 5 min

client.messages.create(): the core method

What you will learn

client.messages.create() is the fundamental method that sends a message to Claude and returns a response.

Why this matters

Every interaction with Claude goes through this single method. Understanding its parameters, response structure, and error modes is the foundation for building any Claude application: from simple chatbots to complex agentic systems.

Skip if: Use client.messages.stream() instead if you need real-time, token-by-token output for long-form generation. Use the Message Batches API if you have thousands of non-urgent requests and want 50% cost savings. There is no separate call for vision tasks: Claude only processes images that are embedded as content blocks in a message.

Explanation

What it does: client.messages.create() sends a list of messages to Claude and returns a single Message object containing the model's response. It's a synchronous, blocking call that waits for the entire response before returning.

How it works: You provide a model (e.g., claude-opus-4-6), a max_tokens limit, and a messages list. The SDK serializes these to JSON, authenticates using your API key from the environment, sends an HTTPS POST to Anthropic's endpoint, and deserializes the response back into a Python object. The response includes the assistant's text, token usage metadata, and a stop reason (e.g., end_turn, max_tokens).
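To make the serialization step concrete, here is a rough sketch of the HTTP request that results from the call. The helper name build_request and the placeholder key are hypothetical; the real SDK adds further headers, timeouts, and retries, so treat this as an illustration rather than the SDK's actual implementation.

```python
import json

# Public Messages API endpoint.
API_URL = "https://api.anthropic.com/v1/messages"


def build_request(api_key, model, max_tokens, messages):
    """Return (url, headers, body) roughly as the SDK would send them.

    Hypothetical helper for illustration only.
    """
    headers = {
        "x-api-key": api_key,               # authentication
        "anthropic-version": "2023-06-01",  # API version header
        "content-type": "application/json",
    }
    body = json.dumps(
        {"model": model, "max_tokens": max_tokens, "messages": messages}
    )
    return API_URL, headers, body


url, headers, body = build_request(
    api_key="sk-ant-placeholder",  # placeholder, not a real key
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(url)
print(body)
```

In normal use you never build this by hand; the point is only that the three required parameters map directly onto the JSON body.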

When to use it: Use this for any request where you need the full response before proceeding: question answering, summarization, code generation, or decision-making steps in a workflow. It's the simplest and most predictable way to call Claude.

Request code

python

import os
from anthropic import Anthropic

api_key = os.environ.get('ANTHROPIC_API_KEY')
if not api_key:
    raise ValueError('ANTHROPIC_API_KEY environment variable not set')

client = Anthropic(api_key=api_key)

message = client.messages.create(
    model='claude-opus-4-6',
    max_tokens=1024,
    messages=[
        {
            'role': 'user',
            'content': 'What is the capital of France?'
        }
    ]
)

print(message.content[0].text)

Authentication

Set the environment variable ANTHROPIC_API_KEY before instantiating the client. The SDK reads this at client creation time, not at request time. Export ANTHROPIC_API_KEY='sk-ant-...' in your shell, or set it in a .env file and load it with python-dotenv before you create the Anthropic client.

Response shape

Field	Description
`id`	Unique message identifier (string, e.g., 'msg_...')
`type`	Always 'message'
`role`	Always 'assistant'
`content`	List of content blocks; for text, contains [{'type': 'text', 'text': 'response string'}]
`model`	The model that generated the response (string)
`stop_reason`	Why generation stopped, e.g. 'end_turn', 'max_tokens', 'stop_sequence', or 'tool_use'
`stop_sequence`	The sequence that triggered stop_reason, if applicable
`usage`	Object with 'input_tokens' (int) and 'output_tokens' (int)
`created_at`	Not returned by the Messages API; the Message object carries no timestamp, so record request time yourself if you need it

Field guide

content

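A list, not a string. For a plain text reply, index it with [0] to access the first (usually only) text block. When tools are in use, the list can hold several blocks of different types, so check each block's type before reading .text.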
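A minimal sketch of a safer way to read content than hard-coding [0]. The helper extract_text is hypothetical, and the SimpleNamespace objects are stand-ins for the SDK's content blocks so the example runs without an API call; it assumes only that each block has a type attribute and that text blocks have a text attribute.

```python
from types import SimpleNamespace


def extract_text(content):
    """Join the text of every block whose type is 'text'; skip other blocks."""
    return "".join(block.text for block in content if block.type == "text")


# Stand-in blocks shaped like message.content (not real SDK objects).
content = [
    SimpleNamespace(type="text", text="Paris is the capital "),
    SimpleNamespace(type="tool_use", id="toolu_example", name="lookup", input={}),
    SimpleNamespace(type="text", text="of France."),
]

print(extract_text(content))
```

With a real response you would call extract_text(message.content).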

usage

The field that tells you billing impact. Approximate cost = input_tokens × input price per token + output_tokens × output price per token. Output tokens are priced higher than input tokens, and both rates vary by model.

stop_reason

Critical for workflow logic. If it's 'max_tokens', you hit your limit and the response is incomplete: you likely need to increase max_tokens or chunk input.
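One way to make truncation impossible to miss is to fail loudly. The sketch below is an assumption about how you might structure this, not an SDK feature: TruncatedResponseError and require_complete are hypothetical names, and the SimpleNamespace objects stand in for real Message objects.

```python
from types import SimpleNamespace


class TruncatedResponseError(Exception):
    """Raised when a response stopped because it hit max_tokens."""


def require_complete(message):
    """Return the message unchanged, or raise if it was cut off by max_tokens."""
    if message.stop_reason == "max_tokens":
        raise TruncatedResponseError(
            "response truncated: raise max_tokens or split the input"
        )
    return message


# Stand-ins for Message objects (only stop_reason matters here).
finished = SimpleNamespace(stop_reason="end_turn")
cut_off = SimpleNamespace(stop_reason="max_tokens")

print(require_complete(finished).stop_reason)

try:
    require_complete(cut_off)
    truncation_detected = False
except TruncatedResponseError:
    truncation_detected = True
print(truncation_detected)
```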

created_at

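Do not rely on this field: the Message object returned by messages.create() does not include a creation timestamp. For debugging rate-limit issues and correlating logs, record your own timestamp around the call and log the message id; the Python SDK also exposes the request ID as message._request_id.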

Setup trap

Setting ANTHROPIC_API_KEY in Python with os.environ['ANTHROPIC_API_KEY'] = '...' after instantiating Anthropic() will not work. The client reads the key at __init__ time. Either set the environment variable before constructing the client, or explicitly pass api_key=os.environ.get('ANTHROPIC_API_KEY') to the Anthropic constructor.

Cost

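This section quotes claude-opus-4-6 at approximately $0.003 per 1K input tokens and $0.015 per 1K output tokens (April 2026 pricing). Rates differ by model and change over time, so confirm them on Anthropic's pricing page before budgeting. At the quoted rates, a 1,000-token input plus a 500-token output costs 1 × $0.003 + 0.5 × $0.015 = $0.0105. Prompt caching can cut the cost of repeated input such as long system prompts: cache reads are billed at roughly 10% of the base input rate, cache writes cost somewhat more than uncached input, and the minimum cacheable prompt length depends on the model.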
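The arithmetic above can be sketched as a small helper. The function name estimate_cost_usd is hypothetical, and the rates passed in are simply the per-1K figures quoted in this section; substitute the current rates for your model.

```python
def estimate_cost_usd(input_tokens, output_tokens,
                      input_usd_per_1k, output_usd_per_1k):
    """Estimate request cost in USD from token counts and per-1K-token rates."""
    input_cost = (input_tokens / 1000) * input_usd_per_1k
    output_cost = (output_tokens / 1000) * output_usd_per_1k
    return input_cost + output_cost


# Rates as quoted in this section (assumed, verify before relying on them).
cost = estimate_cost_usd(1000, 500, input_usd_per_1k=0.003, output_usd_per_1k=0.015)
print(f"${cost:.4f}")
```

With a real response, pass message.usage.input_tokens and message.usage.output_tokens as the first two arguments.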

Rate limits

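Rate limits depend on your organization's usage tier and are enforced per model as requests per minute and input/output tokens per minute, not as a fixed requests-per-second figure; check the Anthropic Console for your actual limits. If you exceed a limit, you'll receive a 429 status code. Implement exponential backoff with jitter (wait 1s, 2s, 4s, etc.) rather than retrying immediately. Note that the Python SDK already retries 429 responses automatically (twice by default, configurable with max_retries). The Message Batches API has its own separate limits and requires asynchronous workflows.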

Common gotcha

Accessing response.text instead of response.content[0].text. The response object has no .text attribute. You must access the content list and index into it, then access the .text property of that content block.

Error recovery

AuthenticationError

ANTHROPIC_API_KEY is missing, invalid, or expired. Verify the key exists in your environment with os.environ.get('ANTHROPIC_API_KEY') and confirm it starts with 'sk-ant-'. Regenerate it in the Anthropic Console if necessary.

RateLimitError

You've exceeded one of your rate limits (requests or tokens per minute, depending on your usage tier). Implement exponential backoff: catch the error, wait 2**retry_count seconds (with random jitter), then retry up to 3 times.
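A minimal sketch of the backoff loop described above. The helper call_with_backoff and the FakeRateLimit exception are hypothetical; the exception type to retry on is passed in so the example runs without the SDK installed, and sleep is injectable so the demo does not actually wait.

```python
import random
import time


def call_with_backoff(fn, retry_on, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a retry_on exception, wait base_delay * 2**attempt seconds
    plus up to 1s of random jitter, then try again (at most max_retries retries)."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries: surface the original error
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))


class FakeRateLimit(Exception):
    """Stand-in for a rate-limit error in this demo."""


attempts = {"count": 0}
delays = []


def flaky():
    """Fail twice, then succeed."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise FakeRateLimit()
    return "ok"


# Record delays instead of sleeping so the demo finishes instantly.
result = call_with_backoff(flaky, FakeRateLimit, sleep=delays.append)
print(result, attempts["count"], len(delays))
```

In a real application you would pass anthropic.RateLimitError as retry_on and wrap the client.messages.create(...) call in a lambda. If you do this, consider lowering the client's own max_retries so the two retry layers do not multiply.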

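BadRequestError (API error type: invalid_request_error)

Your messages list format is wrong, max_tokens exceeds the model's output limit, or a required parameter is missing. Confirm: messages=[{"role": "user", "content": "..."}], max_tokens is a positive integer within the output limit of the model you chose (the limit varies by model), and model is a current ID such as 'claude-opus-4-6' or 'claude-sonnet-4-6'. An unknown model name is typically reported as NotFoundError (404) rather than BadRequestError.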

APIConnectionError

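A network connectivity issue, or the Anthropic API is unreachable. Check your internet connection. If the problem persists, check status.anthropic.com. The Python SDK retries connection errors automatically a small number of times; add your own client-side retry logic with exponential backoff if you need more.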
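As a quick reference for the errors above, the sketch below maps HTTP status codes to the exception class names the Python SDK documents for them. The helper names are hypothetical and the table is written out by hand rather than imported from the SDK, so verify it against the SDK version you use. APIConnectionError has no status code because no response was received.

```python
# Exception class names as documented for the anthropic Python SDK.
STATUS_TO_EXCEPTION = {
    400: "BadRequestError",
    401: "AuthenticationError",
    403: "PermissionDeniedError",
    404: "NotFoundError",
    422: "UnprocessableEntityError",
    429: "RateLimitError",
}


def exception_name_for(status_code):
    """Return the SDK exception class name expected for an HTTP status code."""
    if status_code in STATUS_TO_EXCEPTION:
        return STATUS_TO_EXCEPTION[status_code]
    if status_code >= 500:
        return "InternalServerError"
    return "APIStatusError"  # generic base class for other non-2xx responses


def retried_by_default(status_code):
    """True for statuses the SDK retries automatically: 408, 409, 429, and 5xx."""
    return status_code in (408, 409, 429) or status_code >= 500


for code in (400, 401, 429, 503):
    print(code, exception_name_for(code), retried_by_default(code))
```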

Experienced dev note

Always inspect stop_reason in production. If it's 'max_tokens', your response is truncated and the model may have been cut off mid-sentence. For summarization or structured output, give max_tokens some headroom (for example, about 30% above your estimate), then trim the response afterward. For real-time user-facing apps, switch to client.messages.stream() and yield tokens as they arrive: users see the first tokens sooner, so the response feels much faster even if total latency is the same. Also: treat the messages list as immutable once a request has been sent. Python will not enforce this, so build the list once and reuse it for retries rather than modifying it between attempts.

Check your understanding

Why does accessing response.text fail, and what is the correct way to extract Claude's response text from the response object?

Answer hint

The response object is a Message whose content field is a list of content blocks, not a string. You must index into that list and then access the .text property of the resulting text block (a TextBlock in the Python SDK).

VERSION Anthropic SDK 0.94.x (April 2026): use current model names such as 'claude-opus-4-6' and 'claude-sonnet-4-6'. Model names are validated by the API, not by the SDK, so older names such as 'claude-3-opus' or 'claude-3-sonnet' fail regardless of SDK version once the model is retired, typically with NotFoundError. Do not use deprecated model strings. The messages.create() signature is stable and not expected to break in future 0.94.x releases.
