How to use Claude thinking tokens

Quick answer

Claude thinking tokens are the tokens a Claude model spends on intermediate reasoning before it writes its final answer. On models that support extended thinking, such as claude-sonnet-4-5, you turn them on explicitly by passing a thinking parameter with a token budget to the Messages API, and the reasoning comes back as separate thinking blocks in the response. Models without extended thinking, such as claude-3-5-sonnet-20241022, have no separate thinking tokens: a prompt that asks for step-by-step or chain-of-thought reasoning makes them write the reasoning into the visible answer instead.

PREREQUISITES

Python 3.8+
Anthropic API key
pip install "anthropic>=0.20"

Setup

Install the Anthropic Python SDK and set your API key as an environment variable to access Claude models.

bash

pip install "anthropic>=0.20"

python

import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
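
A missing environment variable otherwise surfaces as a bare KeyError when the client is created. As a minimal sketch, a small guard can fail earlier with a readable message; the helper name require_api_key is hypothetical and not part of the SDK.

```python
import os


def require_api_key(env=None, name="ANTHROPIC_API_KEY"):
    """Return the API key from the environment, or raise a readable error.

    `require_api_key` is an illustrative helper, not an SDK function.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating the client.")
    return key
```

The returned value can be passed as api_key when constructing the client. The SDK also reads ANTHROPIC_API_KEY on its own when api_key is omitted.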

Step by step

With claude-3-5-sonnet-20241022, which does not support extended thinking, ask explicitly for step-by-step reasoning in the prompt. The model then writes its reasoning into the visible answer as ordinary output tokens, which usually makes the result easier to follow and check.

python

import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

system_prompt = "You are a helpful assistant that explains reasoning step-by-step."
user_prompt = "Explain how to solve 24 divided by 3, showing your thinking steps."

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=300,
    system=system_prompt,
    messages=[{"role": "user", "content": user_prompt}]
)

print(response.content[0].text)

output

Step 1: Understand the problem: We need to divide 24 by 3.
Step 2: Recall division basics: Dividing means splitting into equal parts.
Step 3: Calculate: 24 divided by 3 equals 8.
Answer: 8
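
On models that support extended thinking, the reasoning is requested through the API rather than through the prompt. The sketch below assumes claude-sonnet-4-5 and an SDK release newer than the 0.20 minimum listed in the prerequisites, since that release predates the thinking parameter. The helper names build_thinking_request, split_blocks and main are hypothetical; the thinking parameter and the thinking and text block types are the parts that come from the Messages API.

```python
def build_thinking_request(prompt, budget_tokens=1024, max_tokens=2048,
                           model="claude-sonnet-4-5"):
    """Build keyword arguments for client.messages.create with thinking enabled."""
    # The API expects a thinking budget of at least 1024 tokens,
    # and max_tokens has to leave room for the answer on top of that budget.
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be at least 1024")
    if max_tokens <= budget_tokens:
        raise ValueError("max_tokens must be larger than budget_tokens")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }


def split_blocks(content):
    """Separate reasoning text from answer text in response.content."""
    reasoning, answer = [], []
    for block in content:
        if block.type == "thinking":
            reasoning.append(block.thinking)
        elif block.type == "text":
            answer.append(block.text)
        # Other block types (for example redacted thinking) are skipped here.
    return "\n".join(reasoning), "\n".join(answer)


def main():
    # Imported here so the helpers above can be used without the SDK installed.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    response = client.messages.create(
        **build_thinking_request("What is 24 divided by 3? Show your reasoning.")
    )
    reasoning, answer = split_blocks(response.content)
    print("Reasoning:", reasoning)
    print("Answer:", answer)
```

Calling main() with a valid key sends one request. With thinking enabled, response.content[0] is typically a thinking block rather than the answer, which is why the sketch filters by block type instead of indexing.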

Common variations

You can make the same request asynchronously or switch to a different Claude model such as claude-sonnet-4-5, which also supports extended thinking. Raise max_tokens if the answer is cut off, and adjust the prompt to control how much reasoning the model writes out.

python

import asyncio
import os
import anthropic

# AsyncAnthropic is the awaitable client; the SDK has no messages.acreate method
client = anthropic.AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

async def async_reasoning():
    response = await client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=300,
        system="You are a reasoning assistant.",
        messages=[{"role": "user", "content": "Explain the steps to find the square root of 81."}]
    )
    print(response.content[0].text)

asyncio.run(async_reasoning())

output

Step 1: Understand the problem: Find the square root of 81.
Step 2: Recall that the square root of a number is a value that, when multiplied by itself, gives the original number.
Step 3: Calculate: 9 times 9 equals 81.
Answer: 9
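
The async variant combines with extended thinking in the same way as the synchronous call. The sketch below is one possible shape, not the SDK's own helper: ask_with_thinking and ask_many are hypothetical names, and client is assumed to be an anthropic.AsyncAnthropic instance created by the caller.

```python
import asyncio


async def ask_with_thinking(client, prompt, model="claude-sonnet-4-5"):
    """Send one prompt with thinking enabled and return only the answer text."""
    response = await client.messages.create(
        model=model,
        max_tokens=2048,  # must exceed the thinking budget below
        thinking={"type": "enabled", "budget_tokens": 1024},
        messages=[{"role": "user", "content": prompt}],
    )
    # Skip thinking blocks and keep the visible answer.
    return "".join(block.text for block in response.content if block.type == "text")


async def ask_many(client, prompts):
    """Run several prompts concurrently; results come back in input order."""
    return await asyncio.gather(*(ask_with_thinking(client, p) for p in prompts))
```

A call such as asyncio.run(ask_many(anthropic.AsyncAnthropic(), ["...", "..."])) would issue the requests concurrently, which is the main reason to prefer the async client for batches of reasoning prompts.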

Troubleshooting

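If the model output lacks detailed reasoning, try prompting explicitly for a step-by-step explanation or increasing max_tokens so the answer is not cut off. If you expect separate thinking blocks, check that the model you call supports extended thinking; claude-3-5-sonnet-20241022 does not, so it can only return reasoning as part of the answer text.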
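
These checks can be read straight off the response object. A minimal sketch, assuming a response returned by client.messages.create; the function name diagnose and the wording of its messages are illustrative, while stop_reason and the per-block type field are part of the Messages API response.

```python
def diagnose(response):
    """Suggest a fix when a response has little or no reasoning in it."""
    block_types = {block.type for block in response.content}
    if response.stop_reason == "max_tokens":
        # The model ran out of output budget before finishing.
        return "Output hit the max_tokens limit; raise max_tokens."
    if "thinking" not in block_types:
        return ("No thinking blocks; enable the thinking parameter on a model "
                "that supports it, or prompt for step-by-step reasoning.")
    return "Response contains thinking blocks and finished normally."
```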

Key Takeaways

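Prompts that request step-by-step reasoning make any Claude model write its reasoning into the answer; on their own they do not produce separate thinking tokens.
Extended thinking is enabled per request with the thinking parameter on models that support it, and the reasoning is returned as thinking blocks alongside the answer text.
Adjust prompt style and token limits to optimize reasoning output quality.
Use a model that supports extended thinking, such as claude-sonnet-4-5, when you need thinking blocks; claude-3-5-sonnet-20241022 does not provide them.
Async calls accept the same request parameters as synchronous ones, so prompts and model choices carry over unchanged.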

Verified 2026-04 · claude-3-5-sonnet-20241022, claude-sonnet-4-5
