How-to · Beginner · 3 min read

How to monitor OpenAI assistant usage

Quick answer
Use the OpenAI Python SDK to send a chat completion request and inspect the usage field in the response, which reports token counts for the prompt, the completion, and the total. This lets you monitor usage programmatically and integrate cost tracking into your application.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0" (quoted so the shell does not treat > as redirection)

Setup

Install the official openai Python SDK and set your API key as an environment variable for secure authentication.

bash
pip install "openai>=1.0"
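The SDK reads the key from the OPENAI_API_KEY environment variable. One way to set it in a POSIX shell (the sk-... value below is a placeholder, not a real key):

```shell
# Replace the placeholder with your real key from the OpenAI dashboard
export OPENAI_API_KEY="sk-your-key-here"
```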

Step by step

This example demonstrates how to send a chat completion request to the OpenAI assistant and extract usage details from the response.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello, how many tokens am I using?"}]
)

print("Assistant reply:", response.choices[0].message.content)
print("Usage details:", response.usage)

# Usage fields include prompt_tokens, completion_tokens, total_tokens
prompt_tokens = response.usage.prompt_tokens
completion_tokens = response.usage.completion_tokens
total_tokens = response.usage.total_tokens

print(f"Prompt tokens: {prompt_tokens}")
print(f"Completion tokens: {completion_tokens}")
print(f"Total tokens: {total_tokens}")
output
Assistant reply: Hello! I can't see token counts myself, but your API response reports them in the usage field.
Usage details: CompletionUsage(completion_tokens=5, prompt_tokens=10, total_tokens=15)
Prompt tokens: 10
Completion tokens: 5
Total tokens: 15
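Once you have the token counts, you can turn them into a rough cost estimate. A minimal sketch — the per-million-token prices below are placeholders, not real rates; check OpenAI's current pricing page before relying on the numbers:

```python
# Placeholder prices in USD per 1M tokens -- substitute current published rates
PRICE_PER_1M_INPUT = 0.15
PRICE_PER_1M_OUTPUT = 0.60

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of a single chat completion."""
    return (prompt_tokens * PRICE_PER_1M_INPUT
            + completion_tokens * PRICE_PER_1M_OUTPUT) / 1_000_000

# Using the token counts from the example response above
print(f"Estimated cost: ${estimate_cost(10, 5):.8f}")  # → Estimated cost: $0.00000450
```

In a real application you would feed `response.usage.prompt_tokens` and `response.usage.completion_tokens` into this function after each call.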

Common variations

You can monitor usage asynchronously or with different models by adjusting the model parameter or using async calls with asyncio. Usage tracking works similarly across all OpenAI chat models.

python
import os
import asyncio
from openai import AsyncOpenAI

async def async_usage_monitor():
    # Async requests use the AsyncOpenAI client; the method name is still create
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Check my usage asynchronously."}]
    )
    print("Async assistant reply:", response.choices[0].message.content)
    print("Async usage details:", response.usage)

asyncio.run(async_usage_monitor())
output
Async assistant reply: Your usage tokens are tracked in this async call.
Async usage details: CompletionUsage(completion_tokens=7, prompt_tokens=12, total_tokens=19)
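Whether you call the API synchronously or asynchronously, you will often want running totals across many requests. A minimal sketch of an accumulator — UsageTracker is a hypothetical helper, not part of the SDK:

```python
from dataclasses import dataclass
from types import SimpleNamespace

@dataclass
class UsageTracker:
    """Accumulate token counts across multiple responses."""
    prompt_tokens: int = 0
    completion_tokens: int = 0

    def record(self, usage) -> None:
        # `usage` is the response.usage object returned by the SDK
        self.prompt_tokens += usage.prompt_tokens
        self.completion_tokens += usage.completion_tokens

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

# Simulated usage objects stand in for real response.usage values
tracker = UsageTracker()
tracker.record(SimpleNamespace(prompt_tokens=10, completion_tokens=5))
tracker.record(SimpleNamespace(prompt_tokens=12, completion_tokens=7))
print(tracker.total_tokens)  # → 34
```

In production, call `tracker.record(response.usage)` after every request and log or export the totals periodically.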

Troubleshooting

  • If response.usage is missing, ensure you are using openai SDK version 1.0 or later; older 0.x versions return a different response format.
  • Check that your API key is correctly set in os.environ["OPENAI_API_KEY"].
  • Usage data is only returned for models that support token usage reporting; verify your model supports it.
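The checks above can be wrapped in a small defensive helper so missing usage data never raises an AttributeError — safe_usage is a hypothetical name for illustration:

```python
from types import SimpleNamespace

def safe_usage(response):
    """Return (prompt, completion, total) token counts, or None if absent."""
    usage = getattr(response, "usage", None)
    if usage is None:
        return None
    return (usage.prompt_tokens, usage.completion_tokens, usage.total_tokens)

# Works with any object shaped like a chat completion response
fake = SimpleNamespace(usage=SimpleNamespace(
    prompt_tokens=10, completion_tokens=5, total_tokens=15))
print(safe_usage(fake))                         # → (10, 5, 15)
print(safe_usage(SimpleNamespace(usage=None)))  # → None
```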

Key Takeaways

  • Use the usage field in OpenAI SDK responses to monitor token consumption.
  • Track prompt_tokens, completion_tokens, and total_tokens for cost insights.
  • Ensure your environment variable OPENAI_API_KEY is set for authentication.
  • Async usage monitoring is supported via the AsyncOpenAI client.
  • Keep your SDK updated to access the latest usage tracking features.
Verified 2026-04 · gpt-4o-mini