How to use the Claude API in Python
Direct answer
Use the anthropic Python SDK: initialize anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"]) and call client.messages.create() with model="claude-3-5-sonnet-20241022" and your messages.
Setup
Install
pip install anthropic
Env vars
ANTHROPIC_API_KEY
Imports
import os
import anthropic
Examples
In: What is the capital of France?
Out: The capital of France is Paris.
In: Write a Python function to reverse a string.
Out: Here is a Python function to reverse a string:
```python
def reverse_string(s):
    return s[::-1]
```
In: Explain quantum computing in simple terms.
Out: Quantum computing uses quantum bits that can be in multiple states at once, allowing it to solve certain problems faster than classical computers.
Integration steps
- Install the Anthropic Python SDK with pip.
- Set your API key in the environment variable ANTHROPIC_API_KEY.
- Import the anthropic library and initialize the client with your API key.
- Create a messages list with user prompts.
- Call client.messages.create() with the model and messages.
- Extract the response text from response.content[0].text.
Full code
```python
import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=500,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)
print("Claude response:", response.content[0].text)
```
Output
Claude response: The capital of France is Paris.
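The extraction above assumes the first content block is text. A slightly more defensive sketch joins every text block in the response (plain dicts stand in for the SDK's typed content blocks, which expose the same fields):

```python
# Plain dicts mirror the shape of response.content from the Messages API.
response_content = [{"type": "text", "text": "The capital of France is Paris."}]

# Join all text blocks instead of assuming content[0] is text.
text = "".join(b["text"] for b in response_content if b.get("type") == "text")
print(text)
```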
API trace
Request
{"model": "claude-3-5-sonnet-20241022", "max_tokens": 500, "system": "You are a helpful assistant.", "messages": [{"role": "user", "content": "What is the capital of France?"}]}
Response
{"id": "msg_xxx", "type": "message", "role": "assistant", "model": "claude-3-5-sonnet-20241022", "content": [{"type": "text", "text": "The capital of France is Paris."}], "stop_reason": "end_turn", "stop_sequence": null, "usage": {"input_tokens": 20, "output_tokens": 10}}
Extract
response.content[0].text
Variants
Streaming response ›
Use streaming to display partial responses in real time for a better user experience with long outputs.
```python
import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
stream = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=500,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Explain AI in simple terms."}],
    stream=True
)
# With stream=True the SDK yields typed events, not full messages;
# text arrives in content_block_delta events as event.delta.text.
for event in stream:
    if event.type == "content_block_delta":
        print(event.delta.text, end="", flush=True)
print()
```
Async version ›
Use async calls to handle multiple concurrent requests efficiently in asynchronous Python applications.
```python
import os
import asyncio
import anthropic

async def main():
    # Use AsyncAnthropic; messages.create() is awaited directly
    # (the SDK has no separate acreate method).
    client = anthropic.AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    response = await client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=500,
        system="You are a helpful assistant.",
        messages=[{"role": "user", "content": "Summarize the latest AI trends."}]
    )
    print("Async Claude response:", response.content[0].text)

asyncio.run(main())
```
Use Claude 3 Opus model ›
Use the Claude 3 Opus model for creative writing tasks or when you want a slightly different style or tone.
```python
import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=500,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Generate a poem about spring."}]
)
print("Claude Opus response:", response.content[0].text)
```
Performance
Latency: ~800 ms for a typical 500-token completion on claude-3-5-sonnet-20241022
Cost: ~$0.003 per 500 tokens (varies with the input/output split; check current pricing)
Rate limits: Tier 1: 300 requests per minute / 20,000 tokens per minute
- Use concise prompts to reduce token usage.
- Limit `max_tokens` to avoid unnecessarily long completions.
- Reuse context efficiently by summarizing prior conversation.
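The last tip above can be sketched as a simple history compactor. This is a hypothetical helper (compact_history is not part of the SDK), and a real implementation would summarize the old turns with another model call rather than truncating them:

```python
def compact_history(messages, max_turns=6):
    """Collapse all but the last max_turns messages into one summary turn.

    Naive truncation stands in for a real summarization call here.
    """
    if len(messages) <= max_turns:
        return messages
    old, recent = messages[:-max_turns], messages[-max_turns:]
    summary = " ".join(m["content"] for m in old)[:200]
    return [{"role": "user", "content": "Summary so far: " + summary}] + recent

# Ten turns collapse to one summary message plus the six most recent turns.
history = [{"role": "user", "content": f"turn {i}"} for i in range(10)]
compacted = compact_history(history)
```

The compacted list is then passed as the `messages` argument in place of the full history, cutting input tokens on every subsequent call.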
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Standard call | ~800ms | ~$0.003 | General purpose chat completions |
| Streaming | Starts immediately, total ~800ms | ~$0.003 | Real-time UI updates for long responses |
| Async call | ~800ms | ~$0.003 | Concurrent requests in async apps |
Quick tip
Always set the `system` parameter to guide Claude’s behavior effectively for your use case.
Common mistake
Passing `role="system"` inside the messages array instead of using the `system=` parameter causes errors.
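To make the contrast concrete, here is the shape of a valid request next to the invalid one (plain dicts shown for illustration; the SDK passes these as keyword arguments to client.messages.create()):

```python
# Correct: the system prompt is a top-level field, not a message.
good_request = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 500,
    "system": "You are a helpful assistant.",
    "messages": [{"role": "user", "content": "Hello"}],
}

# Incorrect: role "system" inside messages is rejected by the API,
# which only accepts "user" and "assistant" roles.
bad_request = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 500,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello"},
    ],
}

valid_roles = {"user", "assistant"}
assert all(m["role"] in valid_roles for m in good_request["messages"])
assert not all(m["role"] in valid_roles for m in bad_request["messages"])
```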