How to · Beginner · 3 min read

How to call Mistral with LiteLLM

Quick answer
Use the completion function from the litellm Python package, specifying the model as mistral/mistral-large-latest and passing your prompt as a chat message. LiteLLM routes the request to Mistral's API and returns the completion in an OpenAI-style response object.

PREREQUISITES

  • Python 3.8+
  • pip install litellm
  • A Mistral API key exported as MISTRAL_API_KEY, or access to a LiteLLM proxy server configured with Mistral models

Setup

Install the litellm Python package and set your Mistral API key in the MISTRAL_API_KEY environment variable. Alternatively, you can route requests through a LiteLLM proxy server, run locally or hosted remotely.

bash
pip install litellm
export MISTRAL_API_KEY="your-api-key"

Step by step

This example calls the mistral-large-latest model through litellm's completion function. It sends a single user message and prints the generated text.

python
from litellm import completion

# Reads MISTRAL_API_KEY from the environment; the mistral/ prefix
# routes the request to Mistral's API.
response = completion(
    model="mistral/mistral-large-latest",
    messages=[{"role": "user", "content": "Write a short poem about spring."}],
)

# Print the generated text
print(response.choices[0].message.content)
output
Spring awakens with gentle breeze,
Blossoms dance on budding trees.
Sunlight warms the earth anew,
Life returns in vibrant hue.

Common variations

  • Async calls: Use acompletion with await inside an async function.
  • Different models: Replace mistral-large-latest with other Mistral variants like mistral-small-latest.
  • Custom server: Pass api_base in the completion call to route requests through a LiteLLM proxy server.
python
import asyncio
from litellm import acompletion

async def async_example():
    # acompletion is the async counterpart of completion
    response = await acompletion(
        model="mistral/mistral-large-latest",
        messages=[{"role": "user", "content": "Explain quantum computing in simple terms."}],
    )
    print(response.choices[0].message.content)

asyncio.run(async_example())
output
Quantum computing uses quantum bits, or qubits, which can be both 0 and 1 at the same time, allowing computers to solve certain problems much faster than classical computers.
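For the custom-server variation, the sketch below builds the arguments for a completion call aimed at a LiteLLM proxy. The proxy address (http://localhost:4000) and the exposed model name are assumptions for illustration; the openai/ prefix tells litellm to speak the OpenAI-compatible protocol that the proxy serves.

```python
def proxy_kwargs(prompt: str) -> dict:
    """Arguments for litellm.completion() targeting a LiteLLM proxy."""
    return {
        # openai/ prefix: treat the endpoint as OpenAI-compatible,
        # which is the protocol a LiteLLM proxy exposes.
        "model": "openai/mistral-large-latest",
        "api_base": "http://localhost:4000",  # assumed proxy address
        "messages": [{"role": "user", "content": prompt}],
    }

kwargs = proxy_kwargs("Write a short poem about spring.")
# To send the request, unpack the arguments into completion:
# from litellm import completion
# response = completion(**kwargs)
# print(response.choices[0].message.content)
print(kwargs["model"])
```

Keeping the request arguments in one helper makes it easy to switch between calling Mistral directly (mistral/ prefix, no api_base) and going through a proxy.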

Troubleshooting

  • If you get authentication errors, confirm that MISTRAL_API_KEY is set; for connection errors, verify that your LiteLLM proxy server (if you use one) is running and reachable.
  • For model-not-found errors, confirm the model name (including the mistral/ prefix) is spelled correctly and available on your LiteLLM instance.
  • Check your Python environment and litellm installation if import errors occur.
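Before debugging deeper, a quick environment check can rule out the common causes above. This is a minimal sketch; checking MISTRAL_API_KEY assumes you call Mistral directly rather than through a proxy.

```python
import importlib.util
import os
import sys

# One boolean per common failure mode from the troubleshooting list.
checks = {
    "python>=3.8": sys.version_info >= (3, 8),
    "litellm installed": importlib.util.find_spec("litellm") is not None,
    "MISTRAL_API_KEY set": bool(os.environ.get("MISTRAL_API_KEY")),
}
for name, ok in checks.items():
    print(f"{'OK  ' if ok else 'FAIL'} {name}")
```

If "litellm installed" fails, rerun pip install litellm inside the same environment your script uses.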

Key Takeaways

  • Use the litellm package's completion function to call Mistral models with a few lines of Python.
  • Specify the Mistral model name, prefixed with mistral/, to target different model sizes.
  • Async calls (acompletion) and proxy routing (api_base) provide flexibility for advanced LiteLLM usage.
  • Set MISTRAL_API_KEY, or make sure your LiteLLM proxy is reachable, to avoid authentication and connection errors.
  • Keep litellm updated to support the latest Mistral models and features.
Verified 2026-04 · mistral-large-latest, mistral-small-latest