Code beginner · 3 min read

How to use Claude for JSON output in Python

Direct answer
Use the anthropic Python SDK to call client.messages.create with system instructions to output JSON, then parse the JSON string from response.content[0].text.

Setup

Install
bash
pip install anthropic
Env vars
ANTHROPIC_API_KEY
Imports
python
import os
import json
import anthropic

Examples

In: Generate a JSON object with name and age for a person named Alice, age 30.
Out: {"name": "Alice", "age": 30}
In: Return a JSON array of three fruits with their colors.
Out: [{"fruit": "apple", "color": "red"}, {"fruit": "banana", "color": "yellow"}, {"fruit": "grape", "color": "purple"}]
In: Provide a JSON object with keys "success" (bool) and "message" (string) indicating operation status.
Out: {"success": true, "message": "Operation completed successfully."}

Integration steps

  1. Install the Anthropic Python SDK and set the ANTHROPIC_API_KEY environment variable.
  2. Import anthropic, os, and json modules.
  3. Initialize the Anthropic client with the API key from os.environ.
  4. Create a messages.create call with the system prompt instructing Claude to respond in JSON format and the user prompt with the request.
  5. Parse the JSON string from response.content[0].text using json.loads().
  6. Use or print the parsed JSON object as needed.

Full code

python
import os
import json
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

system_prompt = (
    "You are a helpful assistant that responds ONLY with valid JSON. "
    "Do not include any explanations or extra text."
)

user_prompt = "Generate a JSON object with keys 'name' and 'age' for a person named Alice, age 30."

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=300,
    system=system_prompt,
    messages=[{"role": "user", "content": user_prompt}]
)

json_text = response.content[0].text

try:
    data = json.loads(json_text)
    print("Parsed JSON output:", data)
except json.JSONDecodeError:
    print("Failed to parse JSON. Raw output:", json_text)
output
Parsed JSON output: {'name': 'Alice', 'age': 30}
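Even with a strict system prompt, Claude occasionally wraps the JSON in a Markdown code fence or adds a sentence of prose, which makes the bare json.loads() call above fail. A hedged fallback extractor can salvage those replies; the regex heuristics here are illustrative, not part of the SDK:

```python
import json
import re

def extract_json(text: str):
    """Parse JSON from a model reply, tolerating code fences and surrounding prose."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Strip a Markdown code fence if present.
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fenced:
        return json.loads(fenced.group(1))
    # Fall back to the first {...} or [...] span in the text.
    span = re.search(r"[\[{].*[\]}]", text, re.DOTALL)
    if span:
        return json.loads(span.group(0))
    raise ValueError("no JSON found in reply")

print(extract_json('```json\n{"name": "Alice", "age": 30}\n```'))  # {'name': 'Alice', 'age': 30}
```

json.loads() remains the final arbiter: anything the heuristics recover must still parse cleanly, otherwise the error propagates as before.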

API trace

Request
json
{"model": "claude-3-5-sonnet-20241022", "max_tokens": 300, "system": "You are a helpful assistant that responds ONLY with valid JSON.", "messages": [{"role": "user", "content": "Generate a JSON object with keys 'name' and 'age' for a person named Alice, age 30."}]}
Response
json
{"id": "msg_xxx", "type": "message", "role": "assistant", "model": "claude-3-5-sonnet-20241022", "content": [{"type": "text", "text": "{\"name\": \"Alice\", \"age\": 30}"}], "stop_reason": "end_turn", "usage": {"input_tokens": 50, "output_tokens": 20}}
Extract: response.content[0].text

Variants

Streaming JSON output

Use streaming when expecting large JSON outputs to improve responsiveness and user experience.

python
import os
import json
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

system_prompt = (
    "You are a helpful assistant that streams valid JSON output only."
)

user_prompt = "Generate a JSON array of three fruits with their colors."

json_chunks = []
# messages.stream returns a context manager; text_stream yields text deltas as they arrive.
with client.messages.stream(
    model="claude-3-5-sonnet-20241022",
    max_tokens=300,
    system=system_prompt,
    messages=[{"role": "user", "content": user_prompt}],
) as stream:
    for text in stream.text_stream:
        print(text, end='')
        json_chunks.append(text)

# After streaming, join and parse
json_text = ''.join(json_chunks)
try:
    data = json.loads(json_text)
    print("\nParsed JSON:", data)
except json.JSONDecodeError:
    print("\nFailed to parse streamed JSON.")
Async JSON generation

Use async calls to handle multiple concurrent JSON generation requests efficiently.

python
import os
import json
import asyncio
import anthropic

async def generate_json():
    client = anthropic.AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

    system_prompt = (
        "You are a helpful assistant that responds ONLY with valid JSON."
    )

    user_prompt = "Provide a JSON object with keys 'success' and 'message'."

    response = await client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=300,
        system=system_prompt,
        messages=[{"role": "user", "content": user_prompt}]
    )

    json_text = response.content[0].text
    try:
        data = json.loads(json_text)
        print("Parsed JSON output:", data)
    except json.JSONDecodeError:
        print("Failed to parse JSON. Raw output:", json_text)

asyncio.run(generate_json())
Use a smaller Claude model for faster, cheaper JSON output

Use a smaller Claude model to reduce latency and cost when JSON output complexity is low.

python
import os
import json
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

system_prompt = "Respond ONLY with valid JSON."
user_prompt = "Generate a JSON object with keys 'status' and 'code'."

response = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=200,
    system=system_prompt,
    messages=[{"role": "user", "content": user_prompt}]
)

json_text = response.content[0].text
try:
    data = json.loads(json_text)
    print("Parsed JSON output:", data)
except json.JSONDecodeError:
    print("Failed to parse JSON. Raw output:", json_text)

Performance

Latency: ~1.2s for a typical 200-token JSON output with claude-3-5-sonnet-20241022
Cost: ~$0.003 per 200 tokens on claude-3-5-sonnet-20241022
Rate limits: default tier, 300 requests per minute, 60,000 tokens per minute
  • Keep system and user prompts concise to reduce prompt tokens.
  • Limit max_tokens to the expected JSON size to avoid over-generation.
  • Reuse context where possible to avoid repeated prompt tokens.
| Approach | Latency | Cost/call | Best for |
| --- | --- | --- | --- |
| Standard synchronous call | ~1.2s | ~$0.003 | Simple JSON generation |
| Streaming output | Starts immediately, total ~1.2s | ~$0.003 | Large JSON or UX-sensitive apps |
| Async calls | ~1.2s per call, concurrent | ~$0.003 | High-concurrency scenarios |
| Smaller Claude model | ~0.8s | ~$0.0015 | Low-complexity JSON, cost-sensitive |
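Per-call cost can be derived from the usage object that every Messages API response carries (response.usage.input_tokens and response.usage.output_tokens). The per-million-token prices below are illustrative assumptions, not current list prices:

```python
# Illustrative prices in $ per million tokens -- check Anthropic's pricing page.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one call from the token counts in response.usage."""
    return (input_tokens * INPUT_PRICE_PER_MTOK
            + output_tokens * OUTPUT_PRICE_PER_MTOK) / 1_000_000

# e.g. after a call:
# cost = call_cost(response.usage.input_tokens, response.usage.output_tokens)
print(f"${call_cost(50, 20):.6f}")  # $0.000450
```

Logging these counts per call is the simplest way to verify the ballpark figures in the table above for your own prompts.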

Quick tip

Always instruct Claude explicitly in the system prompt to respond only with valid JSON to simplify parsing.
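A complementary lever is prefilling the assistant turn with an opening brace, so the model continues the JSON object directly rather than preceding it with prose. This sketch uses a hard-coded illustrative continuation in place of a live API call:

```python
import json

# Ending the messages list with a partial assistant turn makes Claude continue it.
messages = [
    {"role": "user", "content": "Generate a JSON object with keys 'name' and 'age'."},
    {"role": "assistant", "content": "{"},  # prefill: the reply continues after "{"
]

# Illustrative continuation a model might return for the prefilled turn:
reply_text = '"name": "Alice", "age": 30}'

# Remember to prepend the prefill before parsing.
data = json.loads("{" + reply_text)
print(data)
```

The one gotcha is visible in the last step: the response text no longer contains the opening brace, so it must be re-attached before json.loads().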

Common mistake

Beginners often forget to parse the JSON string from response.content[0].text and try to use the raw string directly.
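A minimal before/after of that mistake:

```python
import json

raw = '{"name": "Alice", "age": 30}'  # what response.content[0].text holds

# Wrong: indexing the raw string gives characters, not fields.
print(raw[0])           # {

# Right: parse first, then index by key.
data = json.loads(raw)
print(data["name"])     # Alice
print(data["age"] + 1)  # 31 -- arithmetic works only after parsing
```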

Verified 2026-04 · claude-3-5-sonnet-20241022, claude-3-haiku-20240307