How-to · Beginner · 3 min read

How to estimate OpenAI API cost before calling

Quick answer
To estimate OpenAI API cost before calling, calculate the expected token usage for your prompt and completion, then multiply by the model's per-token price. Use the tiktoken library to count tokens and refer to OpenAI's pricing page for accurate rates.
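The arithmetic is simple enough to sketch as a standalone helper. The function name and the prices in the example call are my own placeholders, not official rates; always check OpenAI's pricing page for the real numbers.

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Estimated USD cost: tokens / 1000 times the per-1K rate, summed for both directions."""
    return (prompt_tokens / 1000) * input_price_per_1k \
        + (completion_tokens / 1000) * output_price_per_1k

# Placeholder rates for illustration -- verify at https://openai.com/pricing
print(f"${estimate_cost(500, 200, 0.0025, 0.01):.6f}")
```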

PREREQUISITES

  • Python 3.8+
  • OpenAI API key
  • pip install openai>=1.0 tiktoken

Setup

Install the required Python packages and set your OpenAI API key as an environment variable.

  • Install packages: pip install openai tiktoken
  • Set environment variable: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows; setx only affects new terminals, so open a fresh one afterwards)
bash
pip install openai tiktoken
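Before running anything, it helps to fail fast if the variable is missing rather than hit a confusing authentication error mid-call. A tiny guard like this (the helper name is my own, not part of the SDK) does the trick:

```python
import os

def require_api_key(var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, raising a clear error if it is unset."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"Set {var} before running the examples below.")
    return key
```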

Step by step

This example shows how to estimate the cost of a chat completion call by counting tokens in the prompt and an estimated completion length, then calculating the cost using OpenAI's pricing for gpt-4o.

python
import os
from openai import OpenAI
import tiktoken

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Define the model and pricing (USD per 1K tokens)
model = "gpt-4o"
price_per_1k_prompt = 0.03      # placeholder rate for illustration -- verify current gpt-4o pricing at https://openai.com/pricing
price_per_1k_completion = 0.06  # placeholder rate, not the actual gpt-4o price

# Example messages
messages = [
    {"role": "user", "content": "Explain how to estimate OpenAI API cost before calling."}
]

# Function to count tokens for chat messages
def count_chat_tokens(messages, model_name):
    encoding = tiktoken.encoding_for_model(model_name)
    tokens_per_message = 3  # per-message overhead (approximation; the exact value varies by model)
    tokens_per_name = 1     # extra tokens when a "name" field is present
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3  # every reply is primed with the assistant role
    return num_tokens

# Count prompt tokens
prompt_tokens = count_chat_tokens(messages, model)

# Estimate completion tokens (e.g., max_tokens parameter or expected length)
completion_tokens = 100

total_tokens = prompt_tokens + completion_tokens

# Calculate cost
cost = (prompt_tokens / 1000) * price_per_1k_prompt + (completion_tokens / 1000) * price_per_1k_completion

print(f"Prompt tokens: {prompt_tokens}")
print(f"Completion tokens: {completion_tokens}")
print(f"Estimated total tokens: {total_tokens}")
print(f"Estimated cost: ${cost:.6f} USD")

# Optional: make the actual API call
response = client.chat.completions.create(
    model=model,
    messages=messages,
    max_tokens=completion_tokens
)
print("API response:", response.choices[0].message.content)
output
Prompt tokens: 26
Completion tokens: 100
Estimated total tokens: 126
Estimated cost: $0.006780 USD
API response: To estimate OpenAI API cost before calling, calculate the token usage of your prompt and expected completion, then multiply by the model's per-token price. Use the tiktoken library to count tokens accurately.

Common variations

You can adapt the cost estimation to other models by updating the model variable and its pricing. For asynchronous calls, use the SDK's AsyncOpenAI client with asyncio. Streaming responses are billed the same way; to receive token usage back from a streamed call, pass stream_options={"include_usage": True}.

python
import asyncio
import os

from openai import AsyncOpenAI

async def async_call():
    # AsyncOpenAI exposes the same methods as OpenAI, but they must be awaited
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello async!"}],
        max_tokens=50
    )
    print(response.choices[0].message.content)

asyncio.run(async_call())
output
Hello! How can I help you today?
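When estimating across several models, a small pricing table keeps the per-model rates in one place. The prices below are illustrative assumptions (per-1K-token figures I believe are close to current rates, but they change), so treat them as stand-ins and verify against the pricing page:

```python
# Assumed per-1K-token USD prices -- verify at https://openai.com/pricing before relying on them
PRICING = {
    "gpt-4o":      {"input": 0.0025,  "output": 0.01},
    "gpt-4o-mini": {"input": 0.00015, "output": 0.0006},
}

def estimate_for_model(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Look up the model's rates and return the estimated USD cost."""
    rates = PRICING[model]
    return (prompt_tokens / 1000) * rates["input"] \
        + (completion_tokens / 1000) * rates["output"]
```

Switching models then only means changing the dictionary key, not the estimation code.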

Troubleshooting

  • If token counts seem off, ensure you use the correct tiktoken encoding for your model.
  • If you get API errors, verify your API key is set correctly in os.environ.
  • Check OpenAI's pricing page regularly as costs and models may change.

Key Takeaways

  • Use the tiktoken library to count tokens before calling the OpenAI API for accurate cost estimation.
  • Multiply prompt and completion tokens by the model's per-1K-token price to estimate total cost.
  • Always verify current pricing and model names at OpenAI's official pricing page.
  • Set your API key securely in environment variables and never hardcode it.
  • Adjust token estimates based on your expected completion length for precise budgeting.
Verified 2026-04 · gpt-4o