API Beginner easy · 4 min

The response object: what it contains

What you will learn

Every OpenAI API call returns a structured response object containing the model's output, usage metrics, and metadata you need to build production systems.

Why this matters

You can't use the API effectively without knowing where the actual answer lives in the response, how many tokens you burned, or what model version actually ran. Missing fields cause silent bugs.

Skip if: If you're building a simple script that only needs the text output, you could skip this and just grab response.choices[0].message.content. But the moment you care about cost tracking, retry logic, or debugging, you need the full response structure.

Explanation

What the response object contains: When you call client.chat.completions.create(), you get back a ChatCompletion object. This isn't a dictionary: it's a Pydantic model with typed fields. It contains: the actual completion text (inside choices), token usage counts, the model that ran, finish reason, and timestamps. How it works: The OpenAI API sends back JSON from the server. The Python SDK automatically parses it into a ChatCompletion object with dot-notation access. The choices field is a list because the API can generate multiple completions in one request (controlled by the n parameter). The usage field tells you prompt tokens, completion tokens, and total tokens consumed. When to use it: Always capture the full response object, even if you only need the text right now. You'll need usage data for cost tracking, finish_reason to detect truncation, and model to verify the right version ran.

Request code

python

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ.get('OPENAI_API_KEY'))

response = client.chat.completions.create(
    model='gpt-4.1',
    messages=[
        {'role': 'user', 'content': 'What is 2+2?'}
    ]
)

print('Completion text:', response.choices[0].message.content)
print('Tokens used:', response.usage.total_tokens)
print('Finish reason:', response.choices[0].finish_reason)
print('Model:', response.model)
print('Response ID:', response.id)

Authentication

Set your API key before instantiating the client. The OpenAI SDK reads the OPENAI_API_KEY environment variable automatically, or you can pass it explicitly: client = OpenAI(api_key='sk-...'). Test your key works by making one request before writing production code.

Response shape

Field	Description
`id`	string: unique identifier for this completion (e.g., 'chatcmpl-8a9...')
`object`	string: always 'chat.completion'
`created`	integer: Unix timestamp when response was generated
`model`	string: name of the model that was used (e.g., 'gpt-4.1')
`choices`	list of choice objects, each containing: message (with role and content), finish_reason, and index
`choices[0].message.content`	string: the actual text response from the model
`choices[0].finish_reason`	string: 'stop' (normal end), 'length' (hit max_tokens), 'tool_calls' (called a function), or 'content_filter' (blocked)
`usage`	object containing prompt_tokens, completion_tokens, total_tokens
`usage.prompt_tokens`	integer: tokens in your input
`usage.completion_tokens`	integer: tokens in the model's output
`usage.total_tokens`	integer: sum of prompt and completion tokens

Field guide

choices[0].message.content

This is what you actually want: the model's answer. Always check that choices is not empty and access [0] for the first (usually only) completion.

finish_reason

Tells you WHY the response ended. 'stop' means normal completion. 'length' means you hit max_tokens and the answer was cut off: increase max_tokens or split the request. 'content_filter' means OpenAI blocked it for policy reasons.

usage.total_tokens

Multiply by your model's price per token to get actual cost. Store this for billing reconciliation. OpenAI's estimates in docs may differ slightly from actual: this field is ground truth.

model

Confirms which model actually ran. If you request 'gpt-4.1' but get 'gpt-4-turbo' back, something's wrong. Use this in logging to debug routing issues.

id

Include this in error reports or support tickets. OpenAI uses it to look up your exact request in their logs.

Setup trap

Setting OPENAI_API_KEY in your code before instantiating OpenAI() does work: the SDK reads the environment at init time. The actual gotcha: if you set it after OpenAI() is called, it's already too late. Initialize in order: environment variable first, then instantiate client. Also: if you're running in a container, make sure secrets are passed at runtime, not baked into the image.

Cost

Each token costs money. Input tokens are usually 0.5x to 1x the price of output tokens. A single ChatCompletion request with usage tracking lets you bill users accurately. For gpt-4.1: ~0.03 USD per 1K input tokens, ~0.06 per 1K output tokens. Track usage.total_tokens and multiply by your model's rate. Don't estimate: use the actual field.

Rate limits

OpenAI enforces rate limits per minute and per day. If you're making many requests rapidly (e.g., batch processing 10K documents), you'll hit the per-minute limit. The SDK will raise a RateLimitError. Response object itself doesn't include rate limit headers in the 1.x SDK: you'd need to catch the exception and implement exponential backoff separately.

Common gotcha

Accessing response.choices[0].message.content without checking if choices is empty. If the API fails silently or returns an unexpected response structure, this throws an IndexError that's cryptic. Always check: if response.choices: text = response.choices[0].message.content.

Error recovery

APIConnectionError

Network issue or API endpoint unreachable. Check internet connection, retry with exponential backoff (wait 1s, then 2s, then 4s). The response object won't exist: the error is raised before you get one.

AuthenticationError

API key is invalid, expired, or missing. Check OPENAI_API_KEY is set correctly. If using explicit api_key parameter, verify it's not truncated or typo'd.

RateLimitError

You've exceeded your rate limit (requests per minute). Implement exponential backoff: catch the error, wait, retry. The response object isn't returned: you catch the exception.

APIError

Catch-all for server-side errors (5xx). Usually temporary. Retry with backoff. Check OpenAI status page.

ValueError

If you're getting this when accessing the response, you likely passed an invalid model name or malformed messages. Verify model='gpt-4.1' is spelled correctly.

Experienced dev note

Log the entire response.id with every completion in production. When a user says 'your answer was wrong' or 'I got charged twice', you can grep your logs for that response ID and cross-reference it with OpenAI's billing. Also: response.model tells you which version actually ran: crucial for debugging. If you A/B test models, this field proves which one the user got, preventing blame-shifting. One more: finish_reason='length' is a silent failure mode. Set max_tokens high enough and always check finish_reason in monitoring. Truncated outputs look plausible but wrong.

Check your understanding

You're calling the API and get back a response. The finish_reason is 'length'. Your code extracted response.choices[0].message.content successfully. Should you ship this response to the user? Why or why not?

Show answer hint

finish_reason='length' means the model hit max_tokens and the answer was cut off mid-sentence. It's incomplete. The text may look grammatically correct but semantically wrong. You should either increase max_tokens and retry, or inform the user that the response was truncated.

VERSION Breaking change from 0.x to 1.x: response is now a Pydantic model (typed object), not a dictionary. Access fields with dot notation (response.choices[0].message.content), not brackets (response['choices'][0]['message']['content']). If you see ['error']['code'] in examples online, that's 0.x syntax: ignore it. Use 1.x docs at platform.openai.com.

Community Notes

No notes yetBe the first to share a version-specific fix or tip.