How to use the Claude API in Python
Direct answer
Use the anthropic Python SDK: initialize anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"]) and call client.messages.create() with model="claude-3-5-sonnet-20241022" and your messages.
Setup
Install
pip install anthropic
Env vars
ANTHROPIC_API_KEY
Imports
import os
import anthropic
Examples
In: What is the capital of France?
Out: The capital of France is Paris.
In: Write a Python function to reverse a string.
Out: Here is a Python function to reverse a string:
```python
def reverse_string(s):
    return s[::-1]
```
In: Explain quantum computing in simple terms.
Out: Quantum computing uses quantum bits that can be in multiple states at once, allowing it to solve certain problems faster than classical computers.
Integration steps
- Install the Anthropic Python SDK with pip.
- Set your API key in the environment variable ANTHROPIC_API_KEY.
- Import the anthropic library and initialize the client with your API key.
- Create a messages list with user prompts.
- Call client.messages.create() with the model and messages.
- Extract the response text from response.content[0].text.
Full code
```python
import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=500,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)
print("Claude response:", response.content[0].text)
```
Output
Claude response: The capital of France is Paris.
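The extraction above assumes the first content block is text. A slightly more defensive sketch joins every text block in the response (plain dicts stand in for the SDK's typed content blocks, which expose the same fields):

```python
# Plain dicts mirror the shape of response.content from the Messages API.
response_content = [{"type": "text", "text": "The capital of France is Paris."}]

# Join all text blocks instead of assuming content[0] is text.
text = "".join(b["text"] for b in response_content if b.get("type") == "text")
print(text)
```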
API trace
Request
{"model": "claude-3-5-sonnet-20241022", "max_tokens": 500, "system": "You are a helpful assistant.", "messages": [{"role": "user", "content": "What is the capital of France?"}]}
Response
{"id": "msg_xxx", "type": "message", "role": "assistant", "model": "claude-3-5-sonnet-20241022", "content": [{"type": "text", "text": "The capital of France is Paris."}], "stop_reason": "end_turn", "stop_sequence": null, "usage": {"input_tokens": 20, "output_tokens": 10}}
Extract
response.content[0].text
Variants
Streaming response ›
Use streaming to display partial responses in real time for a better user experience with long outputs.
```python
import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
stream = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=500,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Explain AI in simple terms."}],
    stream=True
)
# With stream=True the SDK yields typed events, not full messages;
# text arrives in content_block_delta events as event.delta.text.
for event in stream:
    if event.type == "content_block_delta":
        print(event.delta.text, end="", flush=True)
print()
```
Async version ›
Use async calls to handle multiple concurrent requests efficiently in asynchronous Python applications.
```python
import os
import asyncio
import anthropic

async def main():
    # Use AsyncAnthropic; messages.create() is awaited directly
    # (the SDK has no separate acreate method).
    client = anthropic.AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    response = await client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=500,
        system="You are a helpful assistant.",
        messages=[{"role": "user", "content": "Summarize the latest AI trends."}]
    )
    print("Async Claude response:", response.content[0].text)

asyncio.run(main())
```
Use Claude 3 Opus model ›
Use the Claude 3 Opus model for creative writing tasks or when you want a slightly different style or tone.
```python
import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=500,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Generate a poem about spring."}]
)
print("Claude Opus response:", response.content[0].text)
```
Performance
Latency: ~800 ms for a typical 500-token completion on claude-3-5-sonnet-20241022
Cost: ~$0.003 per 500 tokens (varies with the input/output split; check current pricing)
Rate limits: Tier 1: 300 requests per minute / 20,000 tokens per minute
- Use concise prompts to reduce token usage.
- Limit `max_tokens` to avoid unnecessarily long completions.
- Reuse context efficiently by summarizing prior conversation.
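The last tip above can be sketched as a simple history compactor. This is a hypothetical helper (compact_history is not part of the SDK), and a real implementation would summarize the old turns with another model call rather than truncating them:

```python
def compact_history(messages, max_turns=6):
    """Collapse all but the last max_turns messages into one summary turn.

    Naive truncation stands in for a real summarization call here.
    """
    if len(messages) <= max_turns:
        return messages
    old, recent = messages[:-max_turns], messages[-max_turns:]
    summary = " ".join(m["content"] for m in old)[:200]
    return [{"role": "user", "content": "Summary so far: " + summary}] + recent

# Ten turns collapse to one summary message plus the six most recent turns.
history = [{"role": "user", "content": f"turn {i}"} for i in range(10)]
compacted = compact_history(history)
```

The compacted list is then passed as the `messages` argument in place of the full history, cutting input tokens on every subsequent call.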
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Standard call | ~800ms | ~$0.003 | General purpose chat completions |
| Streaming | Starts immediately, total ~800ms | ~$0.003 | Real-time UI updates for long responses |
| Async call | ~800ms | ~$0.003 | Concurrent requests in async apps |
Quick tip
Always set the `system` parameter to guide Claude’s behavior effectively for your use case.
Common mistake
Passing `role="system"` inside the messages array instead of using the `system=` parameter causes errors.
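To make the contrast concrete, here is the shape of a valid request next to the invalid one (plain dicts shown for illustration; the SDK passes these as keyword arguments to client.messages.create()):

```python
# Correct: the system prompt is a top-level field, not a message.
good_request = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 500,
    "system": "You are a helpful assistant.",
    "messages": [{"role": "user", "content": "Hello"}],
}

# Incorrect: role "system" inside messages is rejected by the API,
# which only accepts "user" and "assistant" roles.
bad_request = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 500,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello"},
    ],
}

valid_roles = {"user", "assistant"}
assert all(m["role"] in valid_roles for m in good_request["messages"])
assert not all(m["role"] in valid_roles for m in bad_request["messages"])
```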