How to build a chatbot with the Claude API in Python
Direct answer
Use the anthropic Python SDK to create a client with your API key, then call client.messages.create with the claude-3-5-sonnet-20241022 model and a messages array to build a chatbot.
Setup
Install
pip install anthropic
Env vars
ANTHROPIC_API_KEY
Imports
import os
import anthropic
Examples
in: Hello, who won the 2020 US presidential election?
out: Joe Biden won the 2020 US presidential election, defeating the incumbent, Donald Trump.
in: Can you help me write a Python function to reverse a string?
out: Sure! Here's a Python function to reverse a string:
```python
def reverse_string(s):
    return s[::-1]
```
in: Tell me a joke about computers.
out: Why do programmers prefer dark mode? Because light attracts bugs!
Integration steps
- Install the anthropic Python SDK and set your API key in the environment variable ANTHROPIC_API_KEY.
- Import the anthropic library and initialize the client with your API key from os.environ.
- Build a messages list containing the user input as a dictionary with role 'user' and the message text as content.
- Call client.messages.create with the model 'claude-3-5-sonnet-20241022', system prompt, max_tokens, and the messages list.
- Extract the chatbot's reply from the response's content field and display it.
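The steps above can be condensed into a couple of small helpers that maintain the conversation history and assemble the request. This is a minimal sketch; the helper names (`append_turn`, `build_request`) are illustrative, not part of the SDK:

```python
MODEL = "claude-3-5-sonnet-20241022"
SYSTEM_PROMPT = "You are a helpful assistant chatbot."

def append_turn(history, role, text):
    """Add one turn to the running conversation (roles alternate user/assistant)."""
    history.append({"role": role, "content": text})
    return history

def build_request(history, max_tokens=300):
    """Assemble the keyword arguments for client.messages.create."""
    return {
        "model": MODEL,
        "max_tokens": max_tokens,
        "system": SYSTEM_PROMPT,
        "messages": history,
    }

# Usage (requires `pip install anthropic` and ANTHROPIC_API_KEY):
#   import os, anthropic
#   client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
#   history = append_turn([], "user", "Hello!")
#   response = client.messages.create(**build_request(history))
#   append_turn(history, "assistant", response.content[0].text)
```

Appending the assistant's reply back onto the history is what turns a single request into a multi-turn chatbot: each new call sees the full conversation so far.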
Full code
import os
import anthropic
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
system_prompt = "You are a helpful assistant chatbot."
messages = [
    {"role": "user", "content": "Hello! Can you tell me a fun fact about space?"}
]
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=300,
    system=system_prompt,
    messages=messages
)
print("Chatbot reply:")
print(response.content[0].text)
Output
Chatbot reply: Sure! Did you know that a day on Venus is longer than a year on Venus? It takes Venus about 243 Earth days to rotate once, but only about 225 Earth days to orbit the Sun.
API trace
Request
{"model": "claude-3-5-sonnet-20241022", "max_tokens": 300, "system": "You are a helpful assistant chatbot.", "messages": [{"role": "user", "content": "Hello! Can you tell me a fun fact about space?"}]} Response
{"id": "chatcmpl-xxx", "object": "chat.completion", "created": 1680000000, "model": "claude-3-5-sonnet-20241022", "choices": [{"index": 0, "message": {"role": "assistant", "content": ["Sure! Did you know that a day on Venus is longer than a year on Venus? It takes Venus about 243 Earth days to rotate once, but only about 225 Earth days to orbit the Sun."]}, "finish_reason": "stop"}], "usage": {"prompt_tokens": 20, "completion_tokens": 50, "total_tokens": 70}} Extract
response.content[0].textVariants
Streaming Chatbot with Claude API ›
Use streaming to display chatbot responses token-by-token for better user experience in interactive applications.
import os
import anthropic
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
system_prompt = "You are a helpful assistant chatbot."
messages = [{"role": "user", "content": "Tell me a story about a robot."}]
with client.messages.stream(
    model="claude-3-5-sonnet-20241022",
    max_tokens=300,
    system=system_prompt,
    messages=messages
) as stream:
    for text in stream.text_stream:
        print(text, end='', flush=True)
print()
Async Chatbot with Claude API ›
Use async calls when integrating the chatbot in asynchronous Python applications or web servers.
import os
import asyncio
import anthropic
async def main():
    client = anthropic.AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    system_prompt = "You are a helpful assistant chatbot."
    messages = [{"role": "user", "content": "Explain quantum computing in simple terms."}]
    response = await client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=300,
        system=system_prompt,
        messages=messages
    )
    print("Chatbot reply:", response.content[0].text)

asyncio.run(main())
Use Claude 3 Haiku Model for Cost Efficiency ›
Use the Claude 3 Haiku model for faster responses and lower cost when ultra-high accuracy is not critical. (Claude 3 Opus is the largest and most expensive model in the Claude 3 family, so it is not the cost-saving choice.)
import os
import anthropic
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
system_prompt = "You are a helpful assistant chatbot."
messages = [{"role": "user", "content": "Summarize the latest tech news."}]
response = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=300,
    system=system_prompt,
    messages=messages
)
print("Chatbot reply:", response.content[0].text)
Performance
Latency: ~800ms for a typical 300-token response on claude-3-5-sonnet-20241022
Cost: ~$0.003 per 300 output tokens on claude-3-5-sonnet-20241022
Rate limits: requests-per-minute and tokens-per-minute caps vary by usage tier and model; check Anthropic's rate limits documentation for current values
- Keep system prompts concise to reduce token usage.
- Limit max_tokens to control response length and cost.
- Reuse conversation context selectively to avoid token bloat.
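One way to apply the last tip is to trim older turns before each request; a minimal sketch (the six-turn cutoff is an arbitrary example, not an API requirement):

```python
def trim_history(history, max_turns=6):
    """Keep only the most recent turns, dropping a leading assistant
    message if needed; the Messages API expects the first message
    to come from the user."""
    trimmed = history[-max_turns:]
    while trimmed and trimmed[0]["role"] == "assistant":
        trimmed = trimmed[1:]
    return trimmed
```

Pass `trim_history(history)` instead of the full history to `client.messages.create` to cap input tokens on long conversations, at the cost of the model forgetting earlier turns.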
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Standard Chatbot (claude-3-5-sonnet-20241022) | ~800ms | ~$0.003 | High-quality, general chatbot |
| Streaming Chatbot | ~800ms initial + streaming | ~$0.003 | Interactive apps needing token-by-token output |
| Async Chatbot | ~800ms async | ~$0.003 | Concurrent or async Python apps |
| Claude 3 Haiku Model | ~600ms | ~$0.0015 | Cost-sensitive or faster responses |
Quick tip
Always provide a clear system prompt via the `system=` parameter to guide Claude's chatbot behavior effectively.
Common mistake
Beginners often try to include system instructions as a message with role 'system', which the Messages API rejects; system instructions go in the top-level `system=` parameter instead.
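To catch this mistake before the request is sent, you can validate roles locally; a small sketch (the helper is illustrative, not part of the SDK):

```python
VALID_ROLES = {"user", "assistant"}

def validate_messages(messages):
    """Raise if any message uses a role the Messages API rejects,
    such as 'system'; system instructions belong in the system= parameter."""
    for m in messages:
        if m["role"] not in VALID_ROLES:
            raise ValueError(
                f"invalid role {m['role']!r}; pass system instructions via system="
            )
    return messages
```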