Beginner to Intermediate · 3 min read

How to enable extended thinking in Claude API

Quick answer
To enable extended thinking in the Claude API, pass the thinking parameter when calling messages.create — for example thinking={"type": "enabled", "budget_tokens": 10000} — with a model that supports it, such as claude-opus-4-20250514 or claude-3-7-sonnet-20250219. Set max_tokens higher than budget_tokens; the response then contains thinking content blocks with Claude's step-by-step reasoning, followed by the final text block.

PREREQUISITES

  • Python 3.8+
  • Anthropic API key
  • A recent anthropic SDK (pip install -U anthropic); older releases predate the thinking parameter

Setup

Install the anthropic Python SDK and set your API key as an environment variable.

  • Run pip install -U anthropic to install or upgrade the SDK.
  • Set your API key in your environment: export ANTHROPIC_API_KEY='your_api_key' (Linux/macOS) or setx ANTHROPIC_API_KEY "your_api_key" (Windows).
bash
pip install -U anthropic
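Before making calls, it can help to fail fast when the key is missing. A minimal helper (a sketch, not part of the SDK — note the SDK also reads ANTHROPIC_API_KEY itself if you omit api_key):

```python
import os

def get_api_key():
    """Read the Anthropic API key from the environment, failing loudly if unset."""
    key = os.environ.get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; see the Setup steps above.")
    return key
```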

Step by step

Call messages.create with a model that supports extended thinking, a max_tokens limit large enough to cover both the reasoning and the final answer, and the thinking parameter with a budget_tokens value below max_tokens.

python
import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

messages = [
    {"role": "user", "content": "Explain how to solve a Rubik's Cube in detail."}
]

response = client.messages.create(
    model="claude-opus-4-20250514",
    max_tokens=16000,  # must be larger than budget_tokens
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=messages,
)

# The content list starts with a thinking block, then the final text block.
for block in response.content:
    if block.type == "thinking":
        print("--- thinking ---")
        print(block.thinking)
    elif block.type == "text":
        print("--- answer ---")
        print(block.text)
output
--- thinking ---
To explain this well I should start with cube notation, then cover each layer in order... [Claude's internal reasoning]
--- answer ---
You can solve a Rubik's Cube by following these steps: First, understand the cube's structure... [extended detailed explanation]
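Thinking tokens are billed as output tokens and draw from the same max_tokens budget, so the visible answer gets whatever is left. A rough illustration of that accounting (plain arithmetic, not an SDK call):

```python
def remaining_answer_budget(max_tokens: int, budget_tokens: int, thinking_used: int) -> int:
    """Tokens left for the visible answer after Claude spends `thinking_used`
    tokens reasoning (thinking counts toward max_tokens)."""
    return max_tokens - min(thinking_used, budget_tokens)

print(remaining_answer_budget(16000, 10000, 8000))  # 8000 tokens remain for the answer
```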

Common variations

Extended thinking works the same way with other supported models, such as claude-sonnet-4-20250514 and claude-3-7-sonnet-20250219 (claude-3-5-sonnet-20241022 does not support it), and you can raise budget_tokens for harder problems. Async calls are supported via the SDK's AsyncAnthropic client, and streaming is supported as well — thinking arrives incrementally as thinking_delta events.

python
import asyncio
import os
import anthropic

async def extended_thinking_async():
    # The async client mirrors the sync API; messages.create is awaited directly.
    client = anthropic.AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    response = await client.messages.create(
        model="claude-opus-4-20250514",
        max_tokens=16000,
        thinking={"type": "enabled", "budget_tokens": 10000},
        messages=[{"role": "user", "content": "Explain the theory of relativity in detail."}],
    )
    # Print only the final answer, skipping the thinking block.
    for block in response.content:
        if block.type == "text":
            print(block.text)

asyncio.run(extended_thinking_async())
output
The theory of relativity can be understood by first considering... [detailed step-by-step explanation]
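When streaming, thinking arrives as thinking_delta events interleaved with the usual text_delta events. The dispatch logic can be sketched offline — here SimpleNamespace stands in for the SDK's delta objects (an assumption about field names based on the documented event types, not a live API call):

```python
from types import SimpleNamespace

def render_delta(delta) -> str:
    """Return the printable payload of a streaming content-block delta."""
    if delta.type == "thinking_delta":
        return delta.thinking  # incremental reasoning text
    if delta.type == "text_delta":
        return delta.text      # incremental answer text
    return ""                  # ignore other delta types (e.g. signature_delta)

# Simulated deltas, standing in for events from client.messages.stream(...)
deltas = [
    SimpleNamespace(type="thinking_delta", thinking="Reasoning first... "),
    SimpleNamespace(type="text_delta", text="Then the answer."),
]
print("".join(render_delta(d) for d in deltas))  # Reasoning first... Then the answer.
```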

Troubleshooting

If the API returns a validation error, check that budget_tokens is at least 1024 and strictly less than max_tokens, and that the model actually supports extended thinking. Requests with thinking enabled cannot also modify temperature, top_p, or top_k. If the answer is cut off, raise max_tokens (thinking tokens count toward it and are billed as output tokens) or lower budget_tokens. Also verify your API key and model name are correct.
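The parameter constraints above can be checked locally before sending a request. A small validator (the 1024-token minimum reflects the documented floor for budget_tokens; treat this as a sanity check, not the authoritative server-side validation):

```python
def thinking_params_ok(max_tokens: int, budget_tokens: int, min_budget: int = 1024) -> bool:
    """True if a thinking budget satisfies the basic API constraints:
    at least min_budget tokens and strictly less than max_tokens."""
    return min_budget <= budget_tokens < max_tokens

print(thinking_params_ok(16000, 10000))  # True
print(thinking_params_ok(1500, 2000))    # False: budget exceeds max_tokens
```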

Key Takeaways

  • Pass thinking={"type": "enabled", "budget_tokens": N} to messages.create with a supported model such as claude-opus-4-20250514.
  • Set max_tokens higher than budget_tokens; thinking tokens count toward output-token usage and billing.
  • Responses contain thinking blocks with Claude's reasoning before the final text block.
  • Async calls (AsyncAnthropic) and streaming (thinking_delta events) are both supported.
Verified 2026-04 · claude-opus-4-20250514, claude-3-7-sonnet-20250219