LiteLLMCostTrackingBudgetExceededError
litellm.errors.LiteLLMCostTrackingBudgetExceededError
Stack trace
litellm.errors.LiteLLMCostTrackingBudgetExceededError: Cost tracking budget exceeded: current usage 105% of limit 100 USD
File "/app/main.py", line 42, in run_query
response = client.chat.completions.create(model="gpt-4o", messages=messages)
File "/usr/local/lib/python3.9/site-packages/litellm/client.py", line 88, in create
raise LiteLLMCostTrackingBudgetExceededError("Cost tracking budget exceeded")
Why it happens
LiteLLM monitors API usage costs against a predefined budget to prevent unexpected charges. When the cumulative cost of requests exceeds this budget, it raises this error to halt further usage until the budget is increased or reset.
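The mechanism described above can be pictured with a toy model. The following is a minimal sketch, not LiteLLM's internal implementation: `CostTracker` and `BudgetExceeded` are hypothetical names introduced only for illustration, and the per-request costs are made up.

```python
class BudgetExceeded(Exception):
    """Hypothetical stand-in for LiteLLMCostTrackingBudgetExceededError."""


class CostTracker:
    """Toy model: accumulate request costs and compare against a fixed budget."""

    def __init__(self, budget_usd: float) -> None:
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def record(self, cost_usd: float) -> None:
        # Add the cost of one request, then check the running total.
        self.spent_usd += cost_usd
        if self.spent_usd > self.budget_usd:
            pct = 100 * self.spent_usd / self.budget_usd
            raise BudgetExceeded(
                f"current usage {pct:.0f}% of limit {self.budget_usd:.0f} USD"
            )
```

With a budget of 100 USD, recording 60 USD and then 45 USD brings the total to 105 USD, so the second call raises, which mirrors the "105% of limit 100 USD" message in the stack trace.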
Detection
Watch LiteLLM's cost tracking logs, or catch LiteLLMCostTrackingBudgetExceededError where you issue requests, so that a budget overrun is detected and handled instead of crashing your app.
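One way to detect the condition in application code is to wrap the request in a try/except and log the overrun. This is a sketch under assumptions: the exception class below is a local stand-in so the example runs without LiteLLM installed (in a real app you would import the class from the path shown in the stack trace), and `run_query` and `send` are hypothetical names.

```python
import logging

logger = logging.getLogger("budget")


class LiteLLMCostTrackingBudgetExceededError(Exception):
    """Local stand-in (assumption) for the library's exception class."""


def run_query(send, messages):
    """Call send(messages) and return (response, budget_exceeded)."""
    try:
        return send(messages), False
    except LiteLLMCostTrackingBudgetExceededError as exc:
        # Record the overrun so it shows up in monitoring instead of crashing.
        logger.warning("Budget exceeded, request skipped: %s", exc)
        return None, True
```

Here `send` would be a small function that wraps the actual client call; returning a flag lets the caller decide whether to alert, degrade gracefully, or stop.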
Causes & fixes
Cause: The configured cost tracking budget is too low for the volume or complexity of requests.
Fix: Increase the cost tracking budget in your LiteLLM client configuration to accommodate your expected usage.
Cause: Unexpectedly high usage or large prompt sizes cause rapid budget consumption.
Fix: Implement usage monitoring and rate limiting to control request volume and prompt size, preventing budget overruns.
Cause: The cost tracking feature is enabled without an appropriate budget value.
Fix: Set a realistic cost tracking budget value in the LiteLLM client initialization to avoid automatic blocking.
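Controlling prompt size, one of the fixes listed above, can be as simple as trimming conversation history before each request. The sketch below is illustrative only: `trim_messages` is a hypothetical helper, and it uses character count as a crude proxy for tokens, which is an assumption rather than how any provider bills.

```python
def trim_messages(messages, max_chars):
    """Keep the most recent messages whose combined content fits max_chars."""
    kept = []
    total = 0
    # Walk from newest to oldest so recent context is preserved first.
    for message in reversed(messages):
        size = len(message["content"])
        if total + size > max_chars:
            break
        kept.append(message)
        total += size
    kept.reverse()  # restore chronological order
    return kept
```

A real implementation would more likely count tokens with the model's tokenizer, but the shape of the loop stays the same.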
Code: broken vs fixed
Broken:
import os
from litellm import LiteLLMClient

client = LiteLLMClient(api_key=os.environ["LITELLM_API_KEY"], cost_tracking_budget=1)  # Budget too low
messages = [{"role": "user", "content": "Hello"}]
response = client.chat.completions.create(model="gpt-4o", messages=messages)  # Raises LiteLLMCostTrackingBudgetExceededError

Fixed:
import os
from litellm import LiteLLMClient

client = LiteLLMClient(api_key=os.environ["LITELLM_API_KEY"], cost_tracking_budget=100)  # Increased budget to 100 USD
messages = [{"role": "user", "content": "Hello"}]
response = client.chat.completions.create(model="gpt-4o", messages=messages)  # Works without error
print(response)
Workaround
Catch LiteLLMCostTrackingBudgetExceededError in a try/except block and queue or delay requests until the budget resets or is increased.
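A possible shape for the queueing approach is sketched below. The exception class is again a local stand-in so the example is self-contained, and `drain` and `send` are hypothetical names; how and when the budget resets is outside the scope of this sketch.

```python
from collections import deque


class LiteLLMCostTrackingBudgetExceededError(Exception):
    """Local stand-in (assumption) for the library's exception class."""


def drain(pending, send):
    """Send queued requests in order and stop at the first budget error.

    Returns the responses collected so far. Requests that were not sent
    stay in `pending` so they can be retried after the budget resets.
    """
    responses = []
    while pending:
        request = pending[0]
        try:
            responses.append(send(request))
        except LiteLLMCostTrackingBudgetExceededError:
            break  # leave this request and the rest queued
        pending.popleft()  # remove only after a successful send
    return responses
```

Calling `drain` again once the budget has been reset or increased picks up where the previous call stopped, because the failed request is never removed from the queue.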
Prevention
Set an appropriate cost tracking budget based on expected usage and monitor usage metrics to adjust budgets proactively before hitting limits.
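Proactive monitoring can be reduced to comparing spend against the budget with an early-warning threshold. The helper below is a hypothetical sketch: `budget_status` and the 80% default are assumptions, and the spend figure would have to come from whatever usage metrics your deployment exposes.

```python
def budget_status(spent_usd, budget_usd, warn_ratio=0.8):
    """Classify current spend as 'ok', 'warning', or 'exceeded'."""
    if budget_usd <= 0:
        raise ValueError("budget_usd must be positive")
    ratio = spent_usd / budget_usd
    if ratio > 1.0:
        return "exceeded"  # spend is above the budget
    if ratio >= warn_ratio:
        return "warning"   # close to the limit: raise the budget or slow down
    return "ok"
```

Running a check like this on a schedule gives you time to adjust the budget before requests start failing.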