How to use LiteLLM proxy with OpenAI SDK
Quick answer
Use the OpenAI SDK with the base_url parameter set to your LiteLLM proxy endpoint to route requests through LiteLLM. This lets you send chat completions through LiteLLM transparently using the standard client.chat.completions.create() method.

Prerequisites

- Python 3.8+
- OpenAI API key (free tier works)
- pip install openai>=1.0
- LiteLLM proxy running and accessible
Setup
Install the official OpenAI Python SDK and ensure your LiteLLM proxy server is running and reachable. Set your OpenAI API key as an environment variable.
- Install the OpenAI SDK: pip install openai
- Export your API key: export OPENAI_API_KEY='your_api_key'
- Have your LiteLLM proxy URL ready, e.g., http://localhost:11434/v1

Step by step
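Before wiring up the SDK, it can help to confirm the proxy actually answers. A minimal stdlib-only sketch, assuming the proxy exposes the OpenAI-compatible /models endpoint and accepts a Bearer token (both assumptions, adjust for your deployment):

```python
import os
import urllib.request
import urllib.error

def models_url(base_url: str) -> str:
    """Build the OpenAI-compatible /models endpoint from a base URL."""
    return base_url.rstrip("/") + "/models"

def check_proxy(base_url: str, api_key: str) -> bool:
    """Return True if the proxy answers the /models endpoint with HTTP 200."""
    req = urllib.request.Request(
        models_url(base_url),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, etc.
        return False

if __name__ == "__main__":
    ok = check_proxy("http://localhost:11434/v1",
                     os.environ.get("OPENAI_API_KEY", ""))
    print("proxy reachable" if ok else "proxy not reachable")
```

If this prints "proxy not reachable", fix connectivity before debugging SDK code.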
Use the OpenAI client with the base_url parameter pointed to your LiteLLM proxy URL. Then call the chat completions endpoint as usual.
```python
import os
from openai import OpenAI

# Initialize the OpenAI client with the LiteLLM proxy base URL
client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url="http://localhost:11434/v1",
)

# Create a chat completion request
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from LiteLLM proxy!"}],
)
print(response.choices[0].message.content)
```

Output:

Hello from LiteLLM proxy! How can I assist you today?
Common variations
You can use any model your LiteLLM proxy is configured to serve by changing the model parameter. For asynchronous calls, use Python's asyncio with the SDK's AsyncOpenAI client. Streaming responses are also supported by passing stream=True in the request.
```python
import os
import asyncio
from openai import AsyncOpenAI

async def async_chat():
    # The async client takes the same api_key/base_url parameters
    client = AsyncOpenAI(
        api_key=os.environ["OPENAI_API_KEY"],
        base_url="http://localhost:11434/v1",
    )
    # Async chat completion with streaming
    stream = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Stream response via LiteLLM proxy."}],
        stream=True,
    )
    # delta is an object; some chunks may carry no choices or no content
    async for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
    print()

asyncio.run(async_chat())
```

Output:

Stream response via LiteLLM proxy.
Troubleshooting
- If you get connection errors, verify the LiteLLM proxy URL and that the proxy server is running.
- Ensure your OPENAI_API_KEY environment variable is set; even if LiteLLM does not validate it, the SDK requires an api_key value.
- Check that the LiteLLM proxy supports the model you specify.
- For SSL errors, confirm whether your proxy uses HTTPS and adjust base_url accordingly.
Key takeaways
- Set the OpenAI SDK's base_url to your LiteLLM proxy endpoint to route requests.
- Use the standard client.chat.completions.create() method with LiteLLM transparently.
- Async and streaming calls work through the SDK's AsyncOpenAI client pointed at the proxy.
- Always verify proxy URL and API key environment variables to avoid connection issues.