How to use the Claude API in Python
Use the anthropic Python SDK by initializing Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"]) and calling client.messages.create() with model="claude-3-5-sonnet-20241022", max_tokens, and your messages.
Setup
Install: pip install anthropic
Environment variable: ANTHROPIC_API_KEY
Imports:
import os
import anthropic
Examples
Integration steps
- Install the Anthropic Python SDK with pip.
- Set your API key in the environment variable ANTHROPIC_API_KEY.
- Import the anthropic library and initialize the client with your API key.
- Create a messages list with user prompts.
- Call client.messages.create() with the model and messages.
- Extract the response text from response.content[0].text.
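The last step assumes the first content block is a text block. A slightly more defensive sketch collects the text from every text block instead. The helper name extract_text is hypothetical, and the stand-in objects only mimic the type and text attributes of the SDK's content blocks so the snippet runs without a network call:

```python
from types import SimpleNamespace


def extract_text(response):
    """Join the text of all text blocks in a Messages API style response."""
    parts = []
    for block in response.content:
        # Skip anything that is not plain text (for example tool-use blocks).
        if getattr(block, "type", None) == "text":
            parts.append(block.text)
    return "".join(parts)


# Stand-in for a real response object, used here only for illustration.
fake_response = SimpleNamespace(
    content=[
        SimpleNamespace(type="text", text="The capital of France "),
        SimpleNamespace(type="tool_use", name="lookup"),
        SimpleNamespace(type="text", text="is Paris."),
    ]
)
print(extract_text(fake_response))
```

With a real response from client.messages.create(), the same helper could be called as extract_text(response).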
Full code
import os
import anthropic

# Read the API key from the environment rather than hard-coding it.
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=500,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)

# The reply is a list of content blocks; the first block holds the text.
print("Claude response:", response.content[0].text)

Output:
Claude response: The capital of France is Paris.
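Because max_tokens caps the reply length, a long answer can be cut off. One way to notice this is to look at the stop_reason and usage fields of the response. The sketch below is only illustrative: describe_response is a hypothetical helper, and the stand-in object with made-up values replaces a real response so the snippet runs offline.

```python
from types import SimpleNamespace


def describe_response(response):
    """Return a short status line built from stop_reason and usage."""
    usage = response.usage
    # A stop_reason of "max_tokens" means the reply hit the max_tokens limit.
    status = "truncated" if response.stop_reason == "max_tokens" else "complete"
    return f"{status}: {usage.input_tokens} input + {usage.output_tokens} output tokens"


# Stand-in mimicking the attributes of a real response; values are made up.
fake_response = SimpleNamespace(
    stop_reason="max_tokens",
    usage=SimpleNamespace(input_tokens=20, output_tokens=500),
)
print(describe_response(fake_response))
```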
API trace
Request:
{"model": "claude-3-5-sonnet-20241022", "max_tokens": 500, "system": "You are a helpful assistant.", "messages": [{"role": "user", "content": "What is the capital of France?"}]}
Response:
{"id": "msg_xxx", "type": "message", "role": "assistant", "model": "claude-3-5-sonnet-20241022", "content": [{"type": "text", "text": "The capital of France is Paris."}], "stop_reason": "end_turn", "usage": {"input_tokens": 20, "output_tokens": 10}}
Extract: response.content[0].text
Variants
Streaming response
Use streaming to display partial responses in real-time for better user experience with long outputs.
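import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

# With stream=True the call returns an iterator of streaming events.
stream = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=500,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Explain AI in simple terms."}],
    stream=True,
)

for event in stream:
    # Text arrives in content_block_delta events that carry a text_delta.
    if event.type == "content_block_delta" and event.delta.type == "text_delta":
        print(event.delta.text, end="", flush=True)
print()
Async version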
Use async calls to handle multiple concurrent requests efficiently in asynchronous Python applications.
import os
import asyncio
import anthropic

async def main():
    # AsyncAnthropic offers the same methods as Anthropic, but as coroutines.
    client = anthropic.AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    response = await client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=500,
        system="You are a helpful assistant.",
        messages=[{"role": "user", "content": "Summarize the latest AI trends."}],
    )
    print("Async Claude response:", response.content[0].text)

asyncio.run(main())
Use Claude 3 Opus model
Use the Claude 3 Opus model for creative writing tasks or when you want a slightly different style or tone.
import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=500,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Generate a poem about spring."}],
)

print("Claude Opus response:", response.content[0].text)
Performance
- Use concise prompts to reduce token usage.
- Limit max_tokens to avoid unnecessarily long completions.
- Reuse context efficiently by summarizing prior conversation.
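The last tip can be sketched roughly as follows. This is one possible approach, not an SDK feature: compact_history is a hypothetical helper, the summary string is assumed to have been produced elsewhere (for example by an earlier call), and the sketch assumes the conversation sent to the API should open with a user turn.

```python
def compact_history(system, messages, summary, keep_last=4):
    """Fold older turns into the system prompt and keep only recent messages."""
    if len(messages) <= keep_last:
        return system, list(messages)
    recent = list(messages[len(messages) - keep_last:])
    # Assumption: the conversation should start with a user message,
    # so drop any leading assistant turns from the kept window.
    while recent and recent[0]["role"] != "user":
        recent.pop(0)
    new_system = f"{system}\n\nSummary of the earlier conversation: {summary}"
    return new_system, recent


history = [
    {"role": "user", "content": "Hi, I am planning a trip to France."},
    {"role": "assistant", "content": "Great! Which cities interest you?"},
    {"role": "user", "content": "Paris and Lyon."},
    {"role": "assistant", "content": "Both are easy to reach by train."},
    {"role": "user", "content": "How long is the train ride between them?"},
]
system, messages = compact_history(
    "You are a helpful assistant.",
    history,
    summary="The user is planning a trip to Paris and Lyon.",
    keep_last=2,
)
print(system)
print(messages)
```

The returned values would then be passed as system=system and messages=messages, so fewer input tokens are sent on each call.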
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Standard call | ~800ms | ~$0.003 | General purpose chat completions |
| Streaming | Starts immediately, total ~800ms | ~$0.003 | Real-time UI updates for long responses |
| Async call | ~800ms | ~$0.003 | Concurrent requests in async apps |
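The per-call cost depends on how many tokens go in and out and on the model's per-token prices. As a rough arithmetic sketch, with a hypothetical helper name and placeholder prices that are illustrative assumptions rather than quoted rates:

```python
def estimate_cost(input_tokens, output_tokens, input_price_per_mtok, output_price_per_mtok):
    """Estimate the cost of one call in dollars from token counts and prices."""
    return (
        input_tokens * input_price_per_mtok
        + output_tokens * output_price_per_mtok
    ) / 1_000_000


# Illustrative numbers only: 500 input tokens, 100 output tokens, and
# placeholder prices of $3 and $15 per million tokens.
cost = estimate_cost(500, 100, input_price_per_mtok=3.0, output_price_per_mtok=15.0)
print(f"${cost:.4f}")
```

Real token counts can be read from response.usage after a call; current prices should be taken from the provider's pricing page.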
Quick tip
Always set the system parameter to guide Claude’s behavior effectively for your use case.
Common mistake
Passing role="system" inside the messages array instead of using the system= parameter causes an error, because messages accepts only the user and assistant roles.
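If existing code already builds a chat-style list that includes a system entry, one way to adapt it is to separate that entry before calling the API. A minimal sketch, with split_system as a hypothetical helper name:

```python
def split_system(messages):
    """Separate system entries from a chat-style list of role/content dicts."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    chat = [m for m in messages if m["role"] != "system"]
    # Multiple system entries are joined into a single prompt string.
    return "\n\n".join(system_parts), chat


mixed = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]
system, messages = split_system(mixed)
print(system)
print(messages)
```

The two results would then be passed as system=system and messages=messages to client.messages.create().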