API Intermediate medium · 5 min

Extracting nested fields from JSON response

What you will learn

Navigate and safely extract deeply nested values from OpenAI API response objects using attribute access and type-safe patterns.

Why this matters

OpenAI responses contain nested structures (choices → message → content, usage → prompt_tokens) that developers often access incorrectly, leading to AttributeError or missed fields like finish_reason. Learning the correct access pattern prevents runtime crashes in production and helps you discover response fields that unlock functionality.

Skip if: You only need to serialize the entire response to JSON or pass it to external systems. In that case response.model_dump() is enough; response.dict() is the older Pydantic v1 spelling and is deprecated under Pydantic v2. For normal field extraction, stick with attribute access: it's type-safe and checked by your linter or type checker.

Explanation

What it does: OpenAI's Python SDK returns strongly-typed response objects (not raw dicts) where nested fields are accessed via dot notation. This differs from older requests-based code where you'd use response['choices'][0]['message']['content'].

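How it works: The SDK uses Pydantic models under the hood. When you call client.chat.completions.create(), it returns a ChatCompletion object. choices is a list of Choice models, usage is a CompletionUsage model, and model is a plain string; none of them is a dict. Accessing them via dot notation gives you type hints in your IDE, because the response is parsed into typed objects when it is received. A misspelled or nonexistent field raises AttributeError immediately instead of silently returning None. Optional fields that exist but are empty, such as message.content on a tool-call response, are still None, so check for that before using the value.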

When to use it: Always access nested response fields via dot notation (response.choices[0].message.content) rather than dict keys. Only convert to dict if you need JSON serialization or are passing data to systems that require dict inputs.

Request code

python

from openai import OpenAI
import os

client = OpenAI(api_key=os.environ.get('OPENAI_API_KEY'))

response = client.chat.completions.create(
    model='gpt-4-turbo',
    messages=[
        {'role': 'user', 'content': 'What is 2+2?'}
    ]
)

message_content = response.choices[0].message.content
finish_reason = response.choices[0].finish_reason
prompt_tokens = response.usage.prompt_tokens
completion_tokens = response.usage.completion_tokens
total_tokens = response.usage.total_tokens
model_name = response.model

print(f'Content: {message_content}')
print(f'Finish reason: {finish_reason}')
print(f'Tokens - Prompt: {prompt_tokens}, Completion: {completion_tokens}, Total: {total_tokens}')
print(f'Model used: {model_name}')

Authentication

Set your OpenAI API key via environment variable before instantiation: export OPENAI_API_KEY='sk-...' Or pass it explicitly to OpenAI(api_key='sk-...'). The SDK reads OPENAI_API_KEY at object construction time, not at import time.

Response shape

Field	Description
`id`	Unique identifier for this completion request (e.g., 'chatcmpl-8z9...')
`object`	Always 'chat.completion'
`created`	Unix timestamp when the response was generated
`model`	Model name used (e.g., 'gpt-4-turbo')
`choices`	List of completion choices. Length depends on n parameter
`choices[0]`	First choice object
`choices[0].message`	Message object containing role and content
`choices[0].message.content`	The text response from the model; can be None (for example, when the model returns tool calls instead of text)
`choices[0].message.role`	Always 'assistant' for API responses
`choices[0].finish_reason`	Why generation stopped ('stop', 'length', 'tool_calls', 'content_filter')
`choices[0].index`	Position in choices array
`usage`	Token usage object
`usage.prompt_tokens`	Tokens in your input message
`usage.completion_tokens`	Tokens in the response
`usage.total_tokens`	Sum of prompt and completion tokens
`usage.prompt_tokens_details.cached_tokens`	Prompt tokens served from the prompt cache (0 when nothing was cached; the details object may be absent on some models or older SDK versions)

Field guide

choices[0].message.content

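The primary field you'll use: contains the model's text response. Usually a string, but it can be None (for example, when the model responds with tool calls), so check before calling string methods on it.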

finish_reason

Critical for production code. 'stop' means the model finished naturally. 'length' means max_tokens was hit (response is incomplete). 'content_filter' means the response was flagged. Never ignore this in logging.

usage.total_tokens

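Directly tied to cost, but input and output tokens are priced differently. Multiply usage.prompt_tokens and usage.completion_tokens by their respective per-1M-token prices rather than pricing total_tokens at a single rate. Cached prompt tokens (usage.prompt_tokens_details.cached_tokens) are billed at a discounted rate that depends on the model.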

choices

Always a list. It typically has length 1 and has more entries when n > 1. Iterate over it or check that it is non-empty before indexing to avoid IndexError.
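A minimal sketch of safe indexing, assuming only the response shape described above. The helper names first_choice_text and all_choice_texts are hypothetical, and the SimpleNamespace objects are stand-ins for a real ChatCompletion so the example runs without an API call.

```python
from types import SimpleNamespace
from typing import List, Optional


def first_choice_text(response) -> Optional[str]:
    """Return the text of the first choice, or None if there is no usable text."""
    choices = getattr(response, "choices", None) or []
    if not choices:
        return None
    # content can be None, for example when the model returns tool calls.
    return choices[0].message.content


def all_choice_texts(response) -> List[str]:
    """Collect the text of every choice that has text (relevant when n > 1)."""
    return [
        choice.message.content
        for choice in response.choices
        if choice.message.content is not None
    ]


# Stand-in objects shaped like a ChatCompletion; not real SDK types.
fake = SimpleNamespace(
    choices=[
        SimpleNamespace(
            index=0,
            finish_reason="stop",
            message=SimpleNamespace(role="assistant", content="4"),
        ),
        SimpleNamespace(
            index=1,
            finish_reason="tool_calls",
            message=SimpleNamespace(role="assistant", content=None),
        ),
    ]
)
empty = SimpleNamespace(choices=[])

print(first_choice_text(fake))
print(first_choice_text(empty))
print(all_choice_texts(fake))
```

With a real response you would pass the object returned by client.chat.completions.create() instead of the stand-ins.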

Setup trap

Setting OPENAI_API_KEY from Python code after you have already constructed the client is too late. The client reads the key at OpenAI() instantiation, not at import time, so the variable must be in the environment before that call. Set the environment variable before running your script, or pass api_key explicitly to OpenAI(api_key='...'). If you're writing a library, lazy-load the client or require callers to pass a pre-configured OpenAI instance.
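One way to lazy-load the client in library code is sketched below. The function name get_client is hypothetical; the only SDK behavior it relies on is that OpenAI() reads OPENAI_API_KEY when it is constructed.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_client():
    """Build the OpenAI client on first use and reuse it afterwards."""
    # Construction is deferred until the first call, so the environment
    # can still be configured after this module has been imported.
    from openai import OpenAI

    # OpenAI() picks up OPENAI_API_KEY from the environment at this point.
    return OpenAI()
```

Callers then write get_client().chat.completions.create(...) instead of using a module-level client. Accepting a pre-configured client as a function argument is the simpler alternative when callers need full control.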

Cost

Each API call costs based on input + output tokens, which are priced at different rates. In the response, check usage.prompt_tokens_details.cached_tokens: cached prompt tokens are billed at a discounted rate that depends on the model. Caching applies automatically to sufficiently long prompts, so repeated requests that share the same leading context (e.g., a large system prompt) can save significant money. Monitor token usage in production; a single call can easily cost $0.01–$1.00 depending on model and length.
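The arithmetic can be sketched as follows. The three prices are hypothetical placeholders, not real rates, and the code assumes that cached tokens are counted inside prompt_tokens; the usage object is a SimpleNamespace stand-in rather than a real SDK response.

```python
from types import SimpleNamespace

# Hypothetical prices in USD per 1M tokens; replace with your model's real rates.
INPUT_PRICE = 10.00
CACHED_INPUT_PRICE = 5.00
OUTPUT_PRICE = 30.00


def estimate_cost(usage) -> float:
    """Estimate the USD cost of one call from its usage object."""
    # prompt_tokens_details may be missing or None, so fall back to 0 cached tokens.
    details = getattr(usage, "prompt_tokens_details", None)
    cached = getattr(details, "cached_tokens", 0) or 0
    # Assumption: cached tokens are a subset of prompt_tokens.
    uncached = usage.prompt_tokens - cached
    total = (
        uncached * INPUT_PRICE
        + cached * CACHED_INPUT_PRICE
        + usage.completion_tokens * OUTPUT_PRICE
    )
    return total / 1_000_000


# Stand-in for response.usage, used only for this demo.
usage = SimpleNamespace(
    prompt_tokens=2000,
    completion_tokens=500,
    total_tokens=2500,
    prompt_tokens_details=SimpleNamespace(cached_tokens=1024),
)
print(f"Estimated cost: ${estimate_cost(usage):.5f}")
```

In real code you would call estimate_cost(response.usage) and keep the price constants next to the model name they belong to.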

Rate limits

You'll hit rate limits (429 error) when you exceed the requests-per-minute or tokens-per-minute limits for your account; the exact numbers depend on your usage tier and the model. Watch for finish_reason == 'length': truncated responses still consume tokens and count against your limits, so raise max_tokens or shorten the prompt instead of retrying the same request. Implement exponential backoff for retries rather than immediate retry.

Common gotcha

Trying to access response['choices'][0]['message']['content'] like a dict will fail immediately with TypeError: 'ChatCompletion' object is not subscriptable. The SDK returns Pydantic model objects, not dicts. Use dot notation: response.choices[0].message.content. Your IDE autocomplete only works with dot notation, so you'll catch typos.

Error recovery

AttributeError: 'ChatCompletion' object has no attribute 'choices'

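Verify that the object you are reading is the ChatCompletion itself and not None or something else (with stream=True the call returns a stream of chunks, not a ChatCompletion). Check the field name against the response shape above. Bracket access such as response['choices'] raises TypeError rather than AttributeError, and the fix is the same: use dot notation.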

IndexError: list index out of range

You're accessing choices[0] but the response has no choices. This is rare; it can happen if an error payload comes back with a 200 OK status, for example through a proxy or gateway. Check that response.choices is non-empty before indexing.

AuthenticationError

OPENAI_API_KEY is not set, expired, or incorrect. Verify it's set in your environment before OpenAI() instantiation. Use client = OpenAI(api_key='sk-...') to debug.

RateLimitError

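Too many requests or tokens in a short window. The SDK already retries some failures automatically (configurable with max_retries on the client). If you add your own handling, implement exponential backoff: retry after 2^attempt seconds, up to a maximum delay and a maximum number of attempts.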
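A minimal backoff sketch, independent of the SDK so it can be tested offline. The helper name with_backoff and its parameters are hypothetical; the small random jitter is an addition that spreads out retries from concurrent clients.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_backoff(
    call: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt (base_delay * 2^attempt), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter on top of the computed delay.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

With the SDK this might be called as with_backoff(lambda: client.chat.completions.create(...), retry_on=(openai.RateLimitError, openai.APIConnectionError)), assuming openai is imported and client is already constructed.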

APIConnectionError

Network issue or OpenAI service is down. Retry with exponential backoff. Check client.base_url if using a proxy.

Experienced dev note

The finish_reason field is your canary in the coal mine. In production, log every response with finish_reason != 'stop' as a warning (treat 'tool_calls' as expected if you use tools). If finish_reason == 'length', your max_tokens is too low and responses are truncated: users are getting incomplete answers. If it's 'content_filter', content was withheld by a content filter; log it for compliance and UX debugging. Also: usage.prompt_tokens_details.cached_tokens is easy to overlook but shows how much of your input was served from the prompt cache at a discounted rate. Caching applies automatically to sufficiently long prompts and matches on the prompt prefix, so keep stable content such as a large system prompt at the start of requests that you repeat.
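The logging rule above can be sketched like this. The function name check_finish_reasons, the logger name, and the expected parameter are hypothetical; the response is a SimpleNamespace stand-in so the example runs without an API call.

```python
import logging
from types import SimpleNamespace
from typing import List, Tuple

logger = logging.getLogger("completions")


def check_finish_reasons(response, expected: Tuple[str, ...] = ("stop",)) -> List[str]:
    """Warn about every choice whose finish_reason is not in expected.

    Returns the list of unexpected finish reasons so callers can react to them.
    """
    unexpected = []
    for choice in response.choices:
        reason = choice.finish_reason
        if reason in expected:
            continue
        unexpected.append(reason)
        logger.warning(
            "completion %s choice %s ended with finish_reason=%s",
            getattr(response, "id", "unknown"),
            choice.index,
            reason,
        )
    return unexpected


# Stand-in response with one normal and one truncated choice.
fake = SimpleNamespace(
    id="chatcmpl-demo",
    choices=[
        SimpleNamespace(index=0, finish_reason="stop"),
        SimpleNamespace(index=1, finish_reason="length"),
    ],
)
print(check_finish_reasons(fake))
```

If your application uses tools, pass expected=("stop", "tool_calls") so normal tool-call responses do not flood the warning log.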

Check your understanding

You're building a chat app that streams responses. Why would you need to check finish_reason even though the stream completed? What could go wrong if you ignore it?

Answer hint

A stream ending only tells you that no more chunks are coming; finish_reason tells you why. It arrives on the final chunk's choices[0].finish_reason: 'stop' means the model finished naturally, 'length' means it hit a token limit and the text is cut off mid-sentence. If you ignore it, users see a truncated answer with no indication that anything is missing. Check finish_reason after streaming completes to decide whether to request a continuation or alert the user.
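A sketch of collecting a streamed response while keeping track of finish_reason. It assumes each chunk exposes choices[0].delta.content and choices[0].finish_reason, and that some chunks may carry no choices at all; collect_stream and _chunk are hypothetical names, and the chunks are SimpleNamespace stand-ins rather than real SDK objects.

```python
from types import SimpleNamespace
from typing import Iterable, Optional, Tuple


def collect_stream(chunks: Iterable) -> Tuple[str, Optional[str]]:
    """Join streamed text deltas and return (text, finish_reason)."""
    parts = []
    finish_reason = None
    for chunk in chunks:
        # Some chunks may have an empty choices list; skip them.
        if not chunk.choices:
            continue
        choice = chunk.choices[0]
        if choice.delta.content:
            parts.append(choice.delta.content)
        # finish_reason stays None until the chunk that ends the choice.
        if choice.finish_reason is not None:
            finish_reason = choice.finish_reason
    return "".join(parts), finish_reason


def _chunk(text, reason=None):
    """Build a stand-in chunk for the demo below."""
    return SimpleNamespace(
        choices=[
            SimpleNamespace(
                delta=SimpleNamespace(content=text),
                finish_reason=reason,
            )
        ]
    )


demo = [
    _chunk("The answer"),
    _chunk(" is"),
    _chunk(None, "length"),
    SimpleNamespace(choices=[]),
]
text, reason = collect_stream(demo)
if reason == "length":
    print(f"Truncated after: {text!r}")
```

With the SDK you would pass the iterator returned by client.chat.completions.create(..., stream=True) in place of demo.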

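VERSION openai >= 1.0.0 returns Pydantic models (the SDK works with both Pydantic v1 and v2). Versions older than 1.0.0 returned dict-like objects. If upgrading legacy code, change all response['field'] to response.field, or call response.model_dump() where a dict is genuinely required. Response objects no longer support dict-style access.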
