How to use async completion with LiteLLM
Quick answer
Use the
LiteLLM Python client with async methods by importing asyncio and calling await client.completions.acreate() inside an async function. This enables non-blocking, concurrent AI completions with LiteLLM.PREREQUISITES
Python 3.8+pip install litellmBasic knowledge of async/await in Python
Setup
Install the litellm package and ensure you have Python 3.8 or newer. No API key is required for local LiteLLM usage.
pip install litellm Step by step
Use Python's asyncio to run async completions with LiteLLM. Instantiate the client, then call acreate() on client.completions inside an async function.
import asyncio
from litellm import LiteLLM
async def main():
client = LiteLLM()
response = await client.completions.acreate(
model="litellm-small",
prompt="Write a short poem about AI.",
max_tokens=50
)
print(response.choices[0].message.content)
asyncio.run(main()) output
AI whispers softly, In circuits and in code, Dreams of silicon, In endless data flow.
Common variations
- Use different models by changing the
modelparameter (e.g.,litellm-medium). - Adjust
max_tokensand other parameters for output length and style. - Combine async completions with streaming by using
client.completions.astream()for token-by-token output.
import asyncio
from litellm import LiteLLM
async def stream_example():
client = LiteLLM()
async for chunk in client.completions.astream(
model="litellm-small",
prompt="Explain async in Python.",
max_tokens=30
):
print(chunk.choices[0].delta.get('content', ''), end='', flush=True)
asyncio.run(stream_example()) output
Async in Python allows concurrent execution by using the async and await keywords, enabling efficient I/O-bound operations.
Troubleshooting
- If you get
RuntimeError: This event loop is already running, usenest_asyncioor run your async code in a separate script. - Ensure your Python version supports
asyncio.run()(Python 3.7+). - If
acreate()is not found, verify you have the latestlitellmversion installed.
Key Takeaways
- Use
await client.completions.acreate()for async completions with LiteLLM. - Run async code inside an
asyncfunction withasyncio.run(). - Streaming completions are available via
astream()for token-wise output.