
How to track LLM costs with LiteLLM

Quick answer
Use LiteLLM's built-in cost tracking: call completion() as usual, then read token counts from response.usage and compute the dollar cost with completion_cost(). LiteLLM ships with a pricing map for the major providers, so common models need no extra configuration.

PREREQUISITES

  • Python 3.8+
  • LiteLLM Python package installed (pip install litellm)
  • API key for your LLM provider (e.g., OpenAI API key)
  • Basic familiarity with Python async or sync programming

Set up LiteLLM and your environment

Install the litellm Python package and set your LLM provider's API key as an environment variable. The examples below use OpenAI as the backend.

  • Install LiteLLM: pip install litellm
  • Set environment variable: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows)
```bash
pip install litellm
```
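Before wiring up any requests, it can help to fail fast when the key is missing. The helper below is a small illustration using only the standard library (require_env is a name made up for this sketch, not part of LiteLLM):

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it before calling the API.")
    return value

# Usage: require_env("OPENAI_API_KEY") before building any requests
```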

Step-by-step cost tracking example

This example sends a prompt through completion(), then reads token usage from the response and computes the estimated cost with completion_cost().

```python
from litellm import completion, completion_cost

# litellm reads OPENAI_API_KEY from the environment automatically
response = completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain cost tracking with LiteLLM."}],
)

print("LLM response:", response.choices[0].message.content)

# Token usage is reported by the provider on the response object
usage = response.usage

# completion_cost() looks up the model's per-token pricing and
# computes input cost + output cost for this response
cost = completion_cost(completion_response=response)

print(f"Tokens used: {usage.total_tokens}")
print(f"Estimated cost: ${cost:.6f}")
```

Example output (the response text, token counts, and cost vary from run to run):

```
LLM response: LiteLLM tracks your usage tokens and estimates costs automatically.
Tokens used: 75
Estimated cost: $0.001125
```
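Under the hood, the estimate is just arithmetic: token counts multiplied by per-token prices, with input and output priced separately. A pure-Python sketch of that calculation (the rates here are illustrative examples, not LiteLLM's live pricing table):

```python
# Illustrative per-token rates in USD (made-up values, not live pricing)
RATES = {
    "gpt-4o": {"input": 2.50 / 1_000_000, "output": 10.00 / 1_000_000},
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Compute input cost + output cost from token counts and per-token rates."""
    r = RATES[model]
    return prompt_tokens * r["input"] + completion_tokens * r["output"]

cost = estimate_cost("gpt-4o", prompt_tokens=30, completion_tokens=45)
print(f"${cost:.6f}")  # → $0.000525
```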

Common variations

  • Async usage: call await acompletion(...) instead of completion(...); completion_cost() works on the async response in exactly the same way.
  • Different providers: change the model string (for example claude-3-5-sonnet-20240620 or gemini/gemini-1.5-pro) and set the matching API key environment variable (ANTHROPIC_API_KEY, GEMINI_API_KEY, and so on).
  • Custom cost rates: register your own per-token prices with litellm.register_model() if your provider has special pricing.
```python
import asyncio

from litellm import acompletion, completion_cost

async def async_example():
    # acompletion() is the asynchronous counterpart of completion()
    response = await acompletion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Async cost tracking example."}],
    )
    print("Async LLM response:", response.choices[0].message.content)

    # Usage and cost are read from the response exactly as in the sync case
    cost = completion_cost(completion_response=response)
    print(f"Tokens used: {response.usage.total_tokens}")
    print(f"Estimated cost: ${cost:.6f}")

asyncio.run(async_example())
```

Example output (values vary from run to run):

```
Async LLM response: This demonstrates async cost tracking with LiteLLM.
Tokens used: 68
Estimated cost: $0.001020
```

Troubleshooting cost tracking

  • If completion_cost() raises an error, the model is likely missing from LiteLLM's pricing map; double-check the model name or register prices for it with litellm.register_model().
  • Usage data comes from the provider's response (response.usage); when streaming, some providers only include it on request (for OpenAI, pass stream_options={"include_usage": True}).
  • Cost is computed locally from token counts and the pricing map, so no extra API permissions are needed; only the completion call itself requires a valid key.
  • Turn on LiteLLM's debug logging (for example, set the LITELLM_LOG=DEBUG environment variable) for detailed error messages.
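Once the basics work, a real application usually wants a running total across many requests rather than per-call printouts. A minimal stdlib-only accumulator you can feed with the values read from response.usage and completion_cost() (CostTracker is a name invented for this sketch):

```python
class CostTracker:
    """Accumulate token usage and spend across requests."""

    def __init__(self):
        self.requests = 0
        self.total_tokens = 0
        self.total_cost = 0.0

    def record(self, tokens: int, cost: float) -> None:
        """Record one completed request's token count and estimated cost."""
        self.requests += 1
        self.total_tokens += tokens
        self.total_cost += cost

tracker = CostTracker()
tracker.record(tokens=75, cost=0.001125)
tracker.record(tokens=68, cost=0.001020)
print(f"{tracker.requests} requests, {tracker.total_tokens} tokens, "
      f"${tracker.total_cost:.6f} total")
```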

Key Takeaways

  • Cost tracking works out of the box: pass a response to completion_cost() to get an estimated spend in USD, with no special initialization.
  • Read token usage from response.usage after each request and keep a running total if you need per-session costs.
  • Use acompletion() for non-blocking calls; usage and cost are read from the response the same way as in sync code.
Verified 2026-04 · gpt-4o