How-to · Beginner · 3 min read

How to monitor OpenAI assistant usage

Quick answer
Use the OpenAI Python SDK to send a chat completion request and inspect the usage field in the response, which reports token counts for the prompt, the completion, and the total. This lets you monitor usage programmatically and integrate cost tracking into your application.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0" (quoted so the shell does not treat > as redirection)

Setup

Install the official openai Python SDK and set your API key as an environment variable for secure authentication.

bash
pip install "openai>=1.0"
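The SDK reads the key from the OPENAI_API_KEY environment variable. One way to set it in a POSIX shell (the sk-... value below is a placeholder, not a real key):

```shell
# Replace the placeholder with your real key from the OpenAI dashboard
export OPENAI_API_KEY="sk-your-key-here"
```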

Step by step

This example demonstrates how to send a chat completion request to the OpenAI assistant and extract usage details from the response.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello, how many tokens am I using?"}]
)

print("Assistant reply:", response.choices[0].message.content)
print("Usage details:", response.usage)

# Usage fields include prompt_tokens, completion_tokens, total_tokens
prompt_tokens = response.usage.prompt_tokens
completion_tokens = response.usage.completion_tokens
total_tokens = response.usage.total_tokens

print(f"Prompt tokens: {prompt_tokens}")
print(f"Completion tokens: {completion_tokens}")
print(f"Total tokens: {total_tokens}")
output
Assistant reply: Hello! I can't see token counts myself, but your API response reports them in the usage field.
Usage details: CompletionUsage(completion_tokens=5, prompt_tokens=10, total_tokens=15)
Prompt tokens: 10
Completion tokens: 5
Total tokens: 15
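Once you have the token counts, you can turn them into a rough cost estimate. A minimal sketch — the per-million-token prices below are placeholders, not real rates; check OpenAI's current pricing page before relying on the numbers:

```python
# Placeholder prices in USD per 1M tokens -- substitute current published rates
PRICE_PER_1M_INPUT = 0.15
PRICE_PER_1M_OUTPUT = 0.60

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of a single chat completion."""
    return (prompt_tokens * PRICE_PER_1M_INPUT
            + completion_tokens * PRICE_PER_1M_OUTPUT) / 1_000_000

# Using the token counts from the example response above
print(f"Estimated cost: ${estimate_cost(10, 5):.8f}")  # → Estimated cost: $0.00000450
```

In a real application you would feed `response.usage.prompt_tokens` and `response.usage.completion_tokens` into this function after each call.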

Common variations

You can monitor usage asynchronously or with different models by adjusting the model parameter or using async calls with asyncio. Usage tracking works similarly across all OpenAI chat models.

python
import os
import asyncio
from openai import AsyncOpenAI

async def async_usage_monitor():
    # Async requests use the AsyncOpenAI client; the method name is still create
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Check my usage asynchronously."}]
    )
    print("Async assistant reply:", response.choices[0].message.content)
    print("Async usage details:", response.usage)

asyncio.run(async_usage_monitor())
output
Async assistant reply: Your usage tokens are tracked in this async call.
Async usage details: CompletionUsage(completion_tokens=7, prompt_tokens=12, total_tokens=19)
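Whether you call the API synchronously or asynchronously, you will often want running totals across many requests. A minimal sketch of an accumulator — UsageTracker is a hypothetical helper, not part of the SDK:

```python
from dataclasses import dataclass
from types import SimpleNamespace

@dataclass
class UsageTracker:
    """Accumulate token counts across multiple responses."""
    prompt_tokens: int = 0
    completion_tokens: int = 0

    def record(self, usage) -> None:
        # `usage` is the response.usage object returned by the SDK
        self.prompt_tokens += usage.prompt_tokens
        self.completion_tokens += usage.completion_tokens

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

# Simulated usage objects stand in for real response.usage values
tracker = UsageTracker()
tracker.record(SimpleNamespace(prompt_tokens=10, completion_tokens=5))
tracker.record(SimpleNamespace(prompt_tokens=12, completion_tokens=7))
print(tracker.total_tokens)  # → 34
```

In production, call `tracker.record(response.usage)` after every request and log or export the totals periodically.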

Troubleshooting

  • If response.usage is missing, ensure you are using openai SDK version 1.0 or later; older 0.x versions return a different response format.
  • Check that your API key is correctly set in os.environ["OPENAI_API_KEY"].
  • Usage data is only returned for models that support token usage reporting; verify your model supports it.
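The checks above can be wrapped in a small defensive helper so missing usage data never raises an AttributeError — safe_usage is a hypothetical name for illustration:

```python
from types import SimpleNamespace

def safe_usage(response):
    """Return (prompt, completion, total) token counts, or None if absent."""
    usage = getattr(response, "usage", None)
    if usage is None:
        return None
    return (usage.prompt_tokens, usage.completion_tokens, usage.total_tokens)

# Works with any object shaped like a chat completion response
fake = SimpleNamespace(usage=SimpleNamespace(
    prompt_tokens=10, completion_tokens=5, total_tokens=15))
print(safe_usage(fake))                         # → (10, 5, 15)
print(safe_usage(SimpleNamespace(usage=None)))  # → None
```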

Key Takeaways

  • Use the usage field in OpenAI SDK responses to monitor token consumption.
  • Track prompt_tokens, completion_tokens, and total_tokens for cost insights.
  • Ensure your environment variable OPENAI_API_KEY is set for authentication.
  • Async usage monitoring is supported via the AsyncOpenAI client.
  • Keep your SDK updated to access the latest usage tracking features.
Verified 2026-04 · gpt-4o-mini