How-to · Beginner · 3 min read

How to use LiteLLM with Langfuse

Quick answer
Use the litellm Python client to call your language model (local or hosted) and let Langfuse capture detailed telemetry and usage data. Set your Langfuse public and secret keys as environment variables, then register the built-in callback with litellm.success_callback = ["langfuse"] so every completion call is logged and monitored automatically.

PREREQUISITES

  • Python 3.8+
  • pip install litellm langfuse
  • Langfuse public and secret API keys (sign up at https://langfuse.com)

Setup

Install the required packages litellm and langfuse via pip, and set your Langfuse public and secret keys as the environment variables LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY.

bash
pip install litellm langfuse
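Langfuse reads its credentials from the environment. A minimal sketch — the key values below are placeholders, so substitute the real ones from your Langfuse project settings:

```bash
# Placeholder keys — replace with the values from your Langfuse project settings
export LANGFUSE_PUBLIC_KEY="pk-lf-placeholder"
export LANGFUSE_SECRET_KEY="sk-lf-placeholder"
# Optional: point at a self-hosted or regional Langfuse instance
export LANGFUSE_HOST="https://cloud.langfuse.com"
```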

Step by step

This example registers Langfuse as a litellm success callback and runs a simple prompt with completion(). Once the callback is set, requests and responses are logged to Langfuse automatically for monitoring.

python
import litellm

# Langfuse reads its credentials from the environment:
# LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY (and optionally LANGFUSE_HOST)

# Register Langfuse as a success callback so completed calls are logged
litellm.success_callback = ["langfuse"]

# Run a prompt (any litellm-supported model works; an OpenAI model is shown)
response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a short poem about AI."}],
)
print("Response:\n", response.choices[0].message.content)
output
Response:
 A short poem about AI:

In circuits deep where data flows,
A mind of code and logic grows.
Silent thinker, swift and bright,
Crafting dreams in endless night.

Common variations

  • Use a different model by changing the model argument to completion(); litellm routes the same call to many providers.
  • Use litellm.acompletion() for non-blocking, asynchronous usage.
  • Customize Langfuse telemetry by passing metadata (for example a generation name, user id, or tags) on each call.
python
import asyncio
import litellm

litellm.success_callback = ["langfuse"]

async def async_example():
    # acompletion is the asynchronous counterpart of completion
    response = await litellm.acompletion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Explain Langfuse integration."}],
    )
    print("Async response:\n", response.choices[0].message.content)

asyncio.run(async_example())
output
Async response:
 Langfuse integration with LiteLLM enables automatic telemetry capture for your AI calls.
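The metadata bullet above can be sketched concretely. The field names (generation_name, trace_user_id, tags) follow Langfuse's documented LiteLLM integration, but check them against your Langfuse version; the user id and tag values here are hypothetical, and the call is guarded so the script is a no-op without provider credentials:

```python
import os

# Langfuse-specific attributes can be attached to any LiteLLM call via the
# metadata parameter (field names follow Langfuse's LiteLLM integration docs)
metadata = {
    "generation_name": "poem-demo",    # how the call appears in the Langfuse UI
    "trace_user_id": "user-123",       # hypothetical user id for grouping traces
    "tags": ["example", "blog-post"],  # free-form tags for filtering
}

# Only attempt a real call when provider credentials are present
if os.environ.get("OPENAI_API_KEY"):
    import litellm
    litellm.success_callback = ["langfuse"]
    response = litellm.completion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Write a short poem about AI."}],
        metadata=metadata,
    )
    print(response.choices[0].message.content)
```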

Troubleshooting

  • If you see authentication errors, verify that the LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY environment variables are set correctly.
  • If telemetry data is missing, make sure litellm.success_callback = ["langfuse"] runs before your first call; events are sent in the background, so very short-lived scripts can exit before they are flushed.
  • For connection issues, check network access to Langfuse endpoints.
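A quick way to rule out the first issue is a small preflight check before any calls — a sketch using only the standard library (the helper name is ours, not part of either SDK):

```python
import os

REQUIRED = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")

def missing_langfuse_vars(env=None):
    """Return the names of required Langfuse variables absent from env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]

# Fail fast with a clear message instead of a later authentication error
if __name__ == "__main__":
    missing = missing_langfuse_vars()
    if missing:
        print("Missing Langfuse credentials:", ", ".join(missing))
```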

Key Takeaways

  • Register Langfuse with litellm.success_callback = ["langfuse"] to enable automatic telemetry.
  • Set LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY in the environment before running.
  • The integration works with both synchronous (completion) and asynchronous (acompletion) calls.