
How to track LLM costs with LiteLLM

Quick answer
Use LiteLLM's built-in cost tracking: call completion() as usual, then read token counts from response.usage and compute the dollar cost with completion_cost(). LiteLLM ships with a pricing map for the major providers, so common models need no extra configuration.

PREREQUISITES

  • Python 3.8+
  • LiteLLM Python package installed (pip install litellm)
  • API key for your LLM provider (e.g., OpenAI API key)
  • Basic familiarity with Python async or sync programming

Set up LiteLLM and your environment

Install the litellm Python package and set your LLM provider's API key as an environment variable. The examples below use OpenAI as the backend.

  • Install LiteLLM: pip install litellm
  • Set environment variable: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows)
```bash
pip install litellm
```
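Before wiring up any requests, it can help to fail fast when the key is missing. The helper below is a small illustration using only the standard library (require_env is a name made up for this sketch, not part of LiteLLM):

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it before calling the API.")
    return value

# Usage: require_env("OPENAI_API_KEY") before building any requests
```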

Step-by-step cost tracking example

This example sends a prompt through completion(), then reads token usage from the response and computes the estimated cost with completion_cost().

```python
from litellm import completion, completion_cost

# litellm reads OPENAI_API_KEY from the environment automatically
response = completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain cost tracking with LiteLLM."}],
)

print("LLM response:", response.choices[0].message.content)

# Token usage is reported by the provider on the response object
usage = response.usage

# completion_cost() looks up the model's per-token pricing and
# computes input cost + output cost for this response
cost = completion_cost(completion_response=response)

print(f"Tokens used: {usage.total_tokens}")
print(f"Estimated cost: ${cost:.6f}")
```

Example output (the response text, token counts, and cost vary from run to run):

```
LLM response: LiteLLM tracks your usage tokens and estimates costs automatically.
Tokens used: 75
Estimated cost: $0.001125
```
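Under the hood, the estimate is just arithmetic: token counts multiplied by per-token prices, with input and output priced separately. A pure-Python sketch of that calculation (the rates here are illustrative examples, not LiteLLM's live pricing table):

```python
# Illustrative per-token rates in USD (made-up values, not live pricing)
RATES = {
    "gpt-4o": {"input": 2.50 / 1_000_000, "output": 10.00 / 1_000_000},
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Compute input cost + output cost from token counts and per-token rates."""
    r = RATES[model]
    return prompt_tokens * r["input"] + completion_tokens * r["output"]

cost = estimate_cost("gpt-4o", prompt_tokens=30, completion_tokens=45)
print(f"${cost:.6f}")  # → $0.000525
```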

Common variations

  • Async usage: call await acompletion(...) instead of completion(...); completion_cost() works on the async response in exactly the same way.
  • Different providers: change the model string (for example claude-3-5-sonnet-20240620 or gemini/gemini-1.5-pro) and set the matching API key environment variable (ANTHROPIC_API_KEY, GEMINI_API_KEY, and so on).
  • Custom cost rates: register your own per-token prices with litellm.register_model() if your provider has special pricing.
```python
import asyncio

from litellm import acompletion, completion_cost

async def async_example():
    # acompletion() is the asynchronous counterpart of completion()
    response = await acompletion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Async cost tracking example."}],
    )
    print("Async LLM response:", response.choices[0].message.content)

    # Usage and cost are read from the response exactly as in the sync case
    cost = completion_cost(completion_response=response)
    print(f"Tokens used: {response.usage.total_tokens}")
    print(f"Estimated cost: ${cost:.6f}")

asyncio.run(async_example())
```

Example output (values vary from run to run):

```
Async LLM response: This demonstrates async cost tracking with LiteLLM.
Tokens used: 68
Estimated cost: $0.001020
```

Troubleshooting cost tracking

  • If completion_cost() raises an error, the model is likely missing from LiteLLM's pricing map; double-check the model name or register prices for it with litellm.register_model().
  • Usage data comes from the provider's response (response.usage); when streaming, some providers only include it on request (for OpenAI, pass stream_options={"include_usage": True}).
  • Cost is computed locally from token counts and the pricing map, so no extra API permissions are needed; only the completion call itself requires a valid key.
  • Turn on LiteLLM's debug logging (for example, set the LITELLM_LOG=DEBUG environment variable) for detailed error messages.
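Once the basics work, a real application usually wants a running total across many requests rather than per-call printouts. A minimal stdlib-only accumulator you can feed with the values read from response.usage and completion_cost() (CostTracker is a name invented for this sketch):

```python
class CostTracker:
    """Accumulate token usage and spend across requests."""

    def __init__(self):
        self.requests = 0
        self.total_tokens = 0
        self.total_cost = 0.0

    def record(self, tokens: int, cost: float) -> None:
        """Record one completed request's token count and estimated cost."""
        self.requests += 1
        self.total_tokens += tokens
        self.total_cost += cost

tracker = CostTracker()
tracker.record(tokens=75, cost=0.001125)
tracker.record(tokens=68, cost=0.001020)
print(f"{tracker.requests} requests, {tracker.total_tokens} tokens, "
      f"${tracker.total_cost:.6f} total")
```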

Key Takeaways

  • Cost tracking works out of the box: pass a response to completion_cost() to get an estimated spend in USD, with no special initialization.
  • Read token usage from response.usage after each request and keep a running total if you need per-session costs.
  • Use acompletion() for non-blocking calls; usage and cost are read from the response the same way as in sync code.
Verified 2026-04 · gpt-4o