API Intermediate medium · 5 min

Extracting nested fields from JSON response

What you will learn
Navigate and safely extract deeply nested values from OpenAI API response objects using attribute access and type-safe patterns.

Why this matters

OpenAI responses contain nested structures (choices → message → content, usage → prompt_tokens) that developers often access incorrectly, leading to AttributeError or missed fields like finish_reason. Learning the correct access pattern prevents runtime crashes in production and surfaces the fields you need for truncation handling (finish_reason) and cost tracking (usage).

Skip if: you only need to serialize the entire response to JSON or pass it to external systems. In that case use response.model_dump() (or the older response.dict()). For normal field extraction, stick with attribute access: it is type-checked, and typos are caught by your linter.

Explanation

What it does: OpenAI's Python SDK returns strongly-typed response objects (not raw dicts) where nested fields are accessed via dot notation. This differs from older requests-based code where you'd use response['choices'][0]['message']['content'].

How it works: The SDK uses Pydantic models under the hood. When you call client.chat.completions.create(), it returns a ChatCompletion object. choices is a list of model instances, usage is a model instance, and model is a plain string; none of them are dicts. Accessing them via dot notation gives you type hints and autocomplete in your IDE. If you misspell a field name, you get an AttributeError immediately instead of a silent None. Optional fields that do exist on the model, such as message.content, can still legitimately be None, so check them before use.

When to use it: Always access nested response fields via dot notation (response.choices[0].message.content) rather than dict keys. Only convert to dict if you need JSON serialization or are passing data to systems that require dict inputs.
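As a rough illustration of dot-notation access with defensive checks, the sketch below uses types.SimpleNamespace objects as stand-ins for the SDK's response models so that it runs without an API key. The helper name extract_text and the stand-in data are assumptions made for this example, not part of the SDK.

```python
from types import SimpleNamespace
from typing import Optional


def extract_text(response) -> Optional[str]:
    """Return the first choice's text, or None if there is nothing to read."""
    # choices is a list; guard against it being empty before indexing.
    if not response.choices:
        return None
    # content is optional (for example when the model returns tool calls).
    return response.choices[0].message.content


# Stand-ins shaped like a chat completion (hypothetical data, not SDK output).
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            index=0,
            finish_reason="stop",
            message=SimpleNamespace(role="assistant", content="2 + 2 = 4"),
        )
    ]
)
empty_response = SimpleNamespace(choices=[])

print(extract_text(fake_response))
print(extract_text(empty_response))
```

The same function should work unchanged on a real ChatCompletion object, since it only relies on attribute access.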

Request code

```python
import os

from openai import OpenAI

# The key is read when the client is constructed, not at import time.
client = OpenAI(api_key=os.environ.get('OPENAI_API_KEY'))

response = client.chat.completions.create(
    model='gpt-4-turbo',
    messages=[
        {'role': 'user', 'content': 'What is 2+2?'}
    ]
)

# Nested fields are read with dot notation on the typed response object.
message_content = response.choices[0].message.content  # may be None, e.g. for tool calls
finish_reason = response.choices[0].finish_reason
prompt_tokens = response.usage.prompt_tokens
completion_tokens = response.usage.completion_tokens
total_tokens = response.usage.total_tokens
model_name = response.model

print(f'Content: {message_content}')
print(f'Finish reason: {finish_reason}')
print(f'Tokens - Prompt: {prompt_tokens}, Completion: {completion_tokens}, Total: {total_tokens}')
print(f'Model used: {model_name}')
```

Authentication

Set your OpenAI API key via an environment variable before instantiation (export OPENAI_API_KEY='sk-...'), or pass it explicitly to OpenAI(api_key='sk-...'). The SDK reads OPENAI_API_KEY at object construction time, not at import time.

Response shape

| Field | Description |
| --- | --- |
| id | Unique identifier for this completion request (e.g., 'chatcmpl-8z9...') |
| object | Always 'chat.completion' |
| created | Unix timestamp when the response was generated |
| model | Model name used (e.g., 'gpt-4-turbo') |
| choices | List of completion choices. Length depends on the n parameter |
| choices[0] | First choice object |
| choices[0].message | Message object containing role and content |
| choices[0].message.content | The actual text response from the model (can be None) |
| choices[0].message.role | Always 'assistant' for API responses |
| choices[0].finish_reason | Why generation stopped ('stop', 'length', 'tool_calls', 'content_filter') |
| choices[0].index | Position in the choices list |
| usage | Token usage object |
| usage.prompt_tokens | Tokens in your input messages |
| usage.completion_tokens | Tokens in the response |
| usage.total_tokens | Sum of prompt and completion tokens |
| usage.prompt_tokens_details.cached_tokens | Prompt tokens served from the prompt cache (when caching applies) |

Field guide

choices[0].message.content

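The primary field you'll use: contains the model's text response. Usually a string, but it can be None (for example, when the model returns tool calls instead of text), so check it before use.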

finish_reason

Critical for production code. 'stop' means the model finished naturally. 'length' means max_tokens was hit (response is incomplete). 'content_filter' means the response was flagged. Never ignore this in logging.

usage.total_tokens

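Directly tied to cost, but input and output tokens are priced differently, so compute cost from prompt_tokens and completion_tokens separately using your model's per-1M-token prices. Cached prompt tokens (usage.prompt_tokens_details.cached_tokens) are billed at a discounted rate that varies by model.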

choices

Always a list. It has one element by default and n elements when you set n > 1. Iterate, or check that it is non-empty before indexing, to avoid IndexError.
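One way to read every choice without indexing blindly is sketched below. The helper name collect_choices and the SimpleNamespace stand-ins are assumptions for illustration; they imitate the shape of a response requested with n=2.

```python
from types import SimpleNamespace


def collect_choices(response) -> dict:
    """Map each choice's index to its (content, finish_reason) pair."""
    results = {}
    # Iterating is safe even when the list is empty: the loop simply never runs.
    for choice in response.choices:
        results[choice.index] = (choice.message.content, choice.finish_reason)
    return results


# Stand-in for a response with two choices (hypothetical data).
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(index=0, finish_reason="stop",
                        message=SimpleNamespace(content="Four.")),
        SimpleNamespace(index=1, finish_reason="length",
                        message=SimpleNamespace(content="2 + 2 equals")),
    ]
)

for index, (content, reason) in collect_choices(fake_response).items():
    print(index, reason, content)
```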

Setup trap

Setting OPENAI_API_KEY in your Python code after the client has been constructed is too late: the client reads the key at OpenAI() instantiation, not at import time. Set the environment variable before running your script (or at least before calling OpenAI()), or pass api_key explicitly to OpenAI(api_key='...'). If you're writing a library, lazy-load the client or require callers to pass a pre-configured OpenAI instance.
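A minimal sketch of the lazy-loading approach for library code follows. The function name get_client and the RuntimeError message are assumptions for this example; the only SDK call used is the OpenAI(api_key=...) constructor shown earlier.

```python
import os
from functools import lru_cache


@lru_cache(maxsize=1)
def get_client():
    """Build the OpenAI client on first use and reuse it afterwards."""
    # The environment is read here, at call time, not when this module loads.
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        # Fail with a clear message instead of a later authentication error.
        raise RuntimeError("OPENAI_API_KEY is not set")
    # Deferred import: nothing from the SDK is touched until a client is needed.
    from openai import OpenAI
    return OpenAI(api_key=api_key)
```

Because the key is looked up inside the function, callers can set the environment variable at any point before the first get_client() call. A failed call is not cached, so a later call made after the key is set will succeed.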

Cost

Each API call is billed on input and output tokens, which are priced separately. In the response, check usage.prompt_tokens_details.cached_tokens: cached prompt tokens are billed at a discounted rate that depends on the model. Prompt caching helps most when repeated requests share the same long context (e.g., large system prompts) and can save significant money. Monitor total_tokens in production; a single call can easily cost $0.01–$1.00 depending on model and length.
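The arithmetic can be sketched as follows. The three prices are hypothetical placeholders, not real rates, and the code assumes that cached tokens are counted within prompt_tokens and reported under usage.prompt_tokens_details.cached_tokens. The helper name estimate_cost is also an assumption for this example.

```python
from types import SimpleNamespace

# Hypothetical prices in USD per 1M tokens; substitute your model's real rates.
INPUT_PRICE_PER_M = 10.00
CACHED_INPUT_PRICE_PER_M = 5.00
OUTPUT_PRICE_PER_M = 30.00


def estimate_cost(usage) -> float:
    """Estimate the USD cost of one call from its usage object."""
    # prompt_tokens_details may be missing or None, so read it defensively.
    details = getattr(usage, "prompt_tokens_details", None)
    cached = (getattr(details, "cached_tokens", 0) or 0) if details else 0
    # Assumption: cached tokens are a subset of prompt_tokens.
    uncached = usage.prompt_tokens - cached
    total = (
        uncached * INPUT_PRICE_PER_M
        + cached * CACHED_INPUT_PRICE_PER_M
        + usage.completion_tokens * OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


# Stand-in usage object (hypothetical numbers).
usage = SimpleNamespace(
    prompt_tokens=2000,
    completion_tokens=500,
    total_tokens=2500,
    prompt_tokens_details=SimpleNamespace(cached_tokens=1000),
)
print(f"Estimated cost: ${estimate_cost(usage):.4f}")
```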

Rate limits

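You'll hit rate limits (429 error, raised as RateLimitError) when you exceed your account's requests-per-minute or tokens-per-minute allowance. The exact limits depend on your usage tier and the model, so check the limits page in your account rather than relying on a fixed number. Truncated outputs still consume tokens, so detect finish_reason == 'length' early instead of spending quota on incomplete responses. Implement exponential backoff for retries rather than retrying immediately.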
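A generic exponential backoff helper might look like the sketch below. The name call_with_backoff and all of its parameters are assumptions for this example; the sleep function is injectable so the demo and tests run without real delays.

```python
import random
import time


def call_with_backoff(fn, retry_on=(Exception,), max_attempts=5,
                      base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))


# Demo with a fake call that fails twice, recording delays instead of sleeping.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


recorded_delays = []
print(call_with_backoff(flaky_call, retry_on=(ConnectionError,),
                        sleep=recorded_delays.append))
```

With the SDK you would pass something like retry_on=(openai.RateLimitError,) and wrap the create() call in a lambda. The client also accepts a max_retries option for its built-in retries, which may be enough on its own.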

Common gotcha

Trying to access response['choices'][0]['message']['content'] like a dict will fail immediately with a TypeError ('ChatCompletion' object is not subscriptable). The SDK returns Pydantic model objects, not dicts. Use dot notation: response.choices[0].message.content. Your IDE autocomplete only works with dot notation, so you'll catch typos.

Error recovery

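AttributeError: '<type>' object has no attribute '<name>'
The object you are reading is not what you expect. Check for a misspelled field name, a variable that is None, or a call made with stream=True, which returns an iterator of chunks rather than a ChatCompletion. Note that bracket access such as response['choices'] raises TypeError, not AttributeError.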
IndexError: list index out of range
You're accessing choices[0] but the response has no choices. This is rare, but check that response.choices is non-empty before indexing.
AuthenticationError
OPENAI_API_KEY is not set, expired, or incorrect. Verify it's set in your environment before OpenAI() instantiation. Use client = OpenAI(api_key='sk-...') to debug.
RateLimitError
Too many requests in a short window. Implement exponential backoff: retry after 2^attempt seconds, up to a maximum.
APIConnectionError
Network issue or OpenAI service is down. Retry with exponential backoff. Check client.base_url if using a proxy.

Experienced dev note

The finish_reason field is your canary in the coal mine. In production, log every response with finish_reason != 'stop' as a warning. If finish_reason == 'length', your max_tokens is too low and responses are truncated: users are getting incomplete answers. If it's 'content_filter', content was omitted because a content filter flagged it; log it for compliance and UX debugging. Also: cached prompt tokens (usage.prompt_tokens_details.cached_tokens) are easy to overlook but can cut input costs substantially on repeated contexts. Keep large, stable content such as system prompts at the start of the prompt so that repeated requests share a cacheable prefix.
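The logging advice above could be sketched like this. The function name audit_finish_reason, the logger name, and the choice to log other reasons (such as 'tool_calls') at info level are all assumptions for this example.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("chat.completions")


def audit_finish_reason(choice) -> bool:
    """Return True for a natural stop; log anything else for follow-up."""
    reason = choice.finish_reason
    if reason == "stop":
        return True
    if reason == "length":
        logger.warning("choice %d truncated: max_tokens reached", choice.index)
    elif reason == "content_filter":
        logger.warning("choice %d flagged by content filter", choice.index)
    else:
        # For example 'tool_calls': not an error, but not a plain text answer.
        logger.info("choice %d ended with finish_reason=%r", choice.index, reason)
    return False


# Stand-in choices (hypothetical data).
truncated = SimpleNamespace(index=0, finish_reason="length")
complete = SimpleNamespace(index=0, finish_reason="stop")

if not audit_finish_reason(truncated):
    print("Response incomplete; consider raising max_tokens.")
```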

Check your understanding

You're building a chat app that streams responses. Why would you need to check finish_reason even though the stream completed? What could go wrong if you ignore it?

Answer hint

finish_reason tells you whether the model finished naturally ('stop') or hit a limit ('length'). When streaming, it arrives on the final chunk, and a stream that ends cleanly can still carry 'length', meaning the text was cut off mid-sentence. If you ignore it, users see a truncated answer with no indication that anything is missing. Check finish_reason once the stream completes to decide whether to raise max_tokens, request a continuation, or alert the user.
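A minimal sketch of that check is shown below, using SimpleNamespace stand-ins shaped like streamed chunks (text in choices[0].delta.content, finish_reason set only on the last chunk). The helper name consume_stream and the sample chunks are assumptions for illustration.

```python
from types import SimpleNamespace


def consume_stream(chunks):
    """Join streamed text and return it with the final finish_reason."""
    parts = []
    finish_reason = None
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks may carry no choices at all
        choice = chunk.choices[0]
        if choice.delta.content:
            parts.append(choice.delta.content)
        # finish_reason stays None until the chunk that ends the choice.
        if choice.finish_reason is not None:
            finish_reason = choice.finish_reason
    return "".join(parts), finish_reason


def fake_chunk(text, finish_reason=None):
    """Build a stand-in chunk (hypothetical data, not SDK output)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(
        choices=[SimpleNamespace(index=0, delta=delta, finish_reason=finish_reason)]
    )


stream = [fake_chunk("The answer "), fake_chunk("is"), fake_chunk(None, "length")]
text, reason = consume_stream(stream)
if reason == "length":
    print(f"Truncated after: {text!r}")
```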

VERSION openai >= 1.0.0 returns Pydantic models. Older code using openai < 1.0.0 received dict-like objects. If upgrading legacy code, change all response['field'] to response.field. The 1.x response models do not support dict-style access.
