How to handle Claude tool call results in Python
Direct answer
Use the Anthropic Python SDK's client.messages.create method to call Claude models, then read the result from response.content[0].text. The response's content field is a list of content blocks; for a plain text reply, the first block carries the text.
Setup
Install
pip install anthropic
Env vars
ANTHROPIC_API_KEY
Imports
import anthropic
import os
Examples
in: User asks: 'What is the capital of France?'
out: Response text: 'The capital of France is Paris.'
in: User asks: 'Summarize the latest AI trends.'
out: Response text: 'Recent AI trends include large multimodal models, improved reasoning, and efficient fine-tuning techniques.'
in: User asks: 'Translate "Hello" to Spanish.'
out: Response text: 'Hola'
Integration steps
- Initialize the Anthropic client with the API key from os.environ
- Prepare the messages list with the user prompt
- Call client.messages.create with the Claude model and messages
- Receive the response object containing the tool call results
- Extract the text result from response.content[0].text
- Use or display the extracted result as needed
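The integration steps above can be folded into one small helper. This is a sketch, not part of the SDK: ask_claude is a hypothetical name, and the client is passed in as a parameter so the same function works with a real anthropic.Anthropic() client or a stub in tests.

```python
def ask_claude(client, prompt, model="claude-3-5-sonnet-20241022",
               max_tokens=500, system="You are a helpful assistant."):
    """Build the messages list, call the API, and extract the text
    from the first content block of the response."""
    response = client.messages.create(
        model=model,
        max_tokens=max_tokens,
        system=system,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text
```

With a real client this is called as ask_claude(anthropic.Anthropic(), "What is the capital of France?"); because the client is injected, any object exposing messages.create works.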
Full code
import anthropic
import os

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
user_prompt = "What is the tallest mountain in the world?"
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=500,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": user_prompt}]
)
result_text = response.content[0].text
print("Claude response:", result_text)
output
Claude response: The tallest mountain in the world is Mount Everest, which stands at 8,848 meters (29,029 feet) above sea level.
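The one-liner above assumes the first content block is text. Responses can carry several blocks (for example a tool_use block alongside text), so a more defensive sketch joins every text block; extract_text is a hypothetical helper name, not an SDK function.

```python
def extract_text(response):
    """Join the text of every text-type content block in a response.
    Works with any object whose .content is a list of blocks carrying
    .type and .text attributes, like the SDK's Message object."""
    return "".join(
        block.text for block in response.content if block.type == "text"
    )
```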
API trace
Request
{"model": "claude-3-5-sonnet-20241022", "max_tokens": 500, "system": "You are a helpful assistant.", "messages": [{"role": "user", "content": "What is the tallest mountain in the world?"}]}
Response
{"id": "msg_xxx", "type": "message", "role": "assistant", "model": "claude-3-5-sonnet-20241022", "content": [{"type": "text", "text": "The tallest mountain in the world is Mount Everest, which stands at 8,848 meters (29,029 feet) above sea level."}], "stop_reason": "end_turn", "usage": {"input_tokens": 20, "output_tokens": 30}}
Extract
response.content[0].text
Variants
Streaming Claude Tool Call ›
Use streaming when you want to display partial results as they arrive for better user experience with long responses.
import anthropic
import os

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
user_prompt = "Explain quantum computing in simple terms."
# Streaming example: messages.stream returns a context manager,
# and text_stream yields text deltas as they arrive
with client.messages.stream(
    model="claude-3-5-sonnet-20241022",
    max_tokens=500,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": user_prompt}]
) as stream:
    for text in stream.text_stream:
        print(text, end='', flush=True)
print()
Async Claude Tool Call ›
Use async calls to handle multiple concurrent Claude requests efficiently in event-driven Python applications.
import anthropic
import asyncio
import os

async def async_claude_call():
    # The async client is AsyncAnthropic; its messages.create is awaited directly
    client = anthropic.AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    user_prompt = "List three benefits of AI in healthcare."
    response = await client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=200,
        system="You are a helpful assistant.",
        messages=[{"role": "user", "content": user_prompt}]
    )
    print("Async Claude response:", response.content[0].text)

asyncio.run(async_claude_call())
Use a Smaller Claude Model for Cost Efficiency ›
Use smaller Claude models like claude-3-haiku-20240307 to reduce cost and latency for simpler tasks.
import anthropic
import os

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
user_prompt = "Give me a brief summary of climate change."
response = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=300,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": user_prompt}]
)
print("Claude smaller model response:", response.content[0].text)
Performance
Latency: ~700ms for typical Claude-3.5-sonnet calls (non-streaming)
Cost: ~$0.003 per 500 tokens for Claude-3.5-sonnet
Rate limits: Tier 1: 300 RPM / 20K TPM
- Use concise prompts to reduce token usage
- Limit <code>max_tokens</code> to expected response size
- Reuse context efficiently to avoid repeating information
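To see how these tips translate into spend, a back-of-the-envelope estimator helps. The per-token rate below is the approximate figure quoted on this page (~$0.003 per 500 tokens), carried over as an assumption rather than an official price list; estimate_cost and estimate_call_cost are hypothetical helper names.

```python
def estimate_cost(total_tokens, usd_per_500_tokens=0.003):
    """Rough spend estimate: scale the per-500-token rate
    by the actual token count of a call."""
    return total_tokens / 500 * usd_per_500_tokens

def estimate_call_cost(usage, usd_per_500_tokens=0.003):
    """Apply the estimate to a response's usage object,
    which reports input_tokens and output_tokens."""
    return estimate_cost(usage.input_tokens + usage.output_tokens,
                         usd_per_500_tokens)
```

Checking response.usage after each call and summing these estimates gives a quick budget sanity check before trusting the billing dashboard.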
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Standard Claude call | ~700ms | ~$0.003/500 tokens | General purpose, high-quality responses |
| Streaming Claude call | Starts ~300ms, streams over time | ~$0.003/500 tokens | Long responses with better UX |
| Async Claude call | ~700ms per call, concurrent | ~$0.003/500 tokens | High throughput, concurrent requests |
| Smaller Claude model | ~400ms | ~$0.0015/500 tokens | Cost-sensitive or simpler tasks |
Quick tip
Always extract the tool call result from <code>response.content[0].text</code> when using the Anthropic SDK for Claude calls.
Common mistake
Beginners often assume <code>response.content</code> is a plain string, or miss the nested <code>content[0].text</code> structure, causing extraction errors.