Code beginner · 3 min read

How to call Mistral API in Python

Direct answer
Use the mistralai Python SDK by importing Mistral, initializing the client with your API key from os.environ, and calling client.chat.complete() with your model and messages.

Setup

Install
bash
pip install mistralai
Env vars
MISTRAL_API_KEY
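Before running any code, export the key in your shell so that os.environ can see it (the key value below is a placeholder, not a real key):

```shell
# Make the key available to Python via os.environ (replace with your real key)
export MISTRAL_API_KEY="your_api_key_here"
```

On Windows, use `setx MISTRAL_API_KEY "your_api_key_here"` in a command prompt instead.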
Imports
python
from mistralai import Mistral
import os

Examples

In: Hello, how are you?
Out: I'm doing great, thank you! How can I assist you today?

In: Explain the benefits of using Mistral API.
Out: Mistral API offers fast, reliable, and state-of-the-art large language models with easy integration and competitive pricing.

In: (empty)
Out: Please provide a prompt to generate a response.
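The empty-input case above is worth guarding against in your own code before you spend an API call. A minimal sketch (the `safe_prompt` helper is ours for illustration, not part of the SDK):

```python
def safe_prompt(prompt):
    """Return a messages list for the API, or None if the prompt is empty."""
    if not prompt or not prompt.strip():
        print("Please provide a prompt to generate a response.")
        return None
    return [{"role": "user", "content": prompt}]

# Only call the API when safe_prompt returns a messages list
messages = safe_prompt("Hello, how are you?")
```

Check the return value for None before passing it to client.chat.complete().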

Integration steps

  1. Install the mistralai package via pip.
  2. Set your MISTRAL_API_KEY environment variable securely.
  3. Import Mistral and os modules in your Python script.
  4. Initialize the Mistral client with the API key from os.environ.
  5. Create a messages list with role and content for the chat completion.
  6. Call client.chat.complete() with the model and messages.
  7. Extract and use the response from response.choices[0].message.content.

Full code

python
from mistralai import Mistral
import os

# Initialize client with API key from environment
client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# Prepare chat messages
messages = [{"role": "user", "content": "Hello, how are you?"}]

# Call the chat completion endpoint
response = client.chat.complete(
    model="mistral-large-latest",
    messages=messages
)

# Extract and print the response text
print("Response:", response.choices[0].message.content)
output
Response: I'm doing great, thank you! How can I assist you today?

API trace

Request
json
{"model": "mistral-large-latest", "messages": [{"role": "user", "content": "Hello, how are you?"}]}
Response
json
{"choices": [{"message": {"role": "assistant", "content": "I'm doing great, thank you! How can I assist you today?"}}], "usage": {"prompt_tokens": 10, "completion_tokens": 15, "total_tokens": 25}}
Extract: response.choices[0].message.content

Variants

Streaming response

Use streaming to display partial responses in real time, which gives a better user experience for long outputs.

python
from mistralai import Mistral
import os

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

messages = [{"role": "user", "content": "Tell me a story."}]

# Streaming chat completion
stream = client.chat.stream(model="mistral-large-latest", messages=messages)
for chunk in stream:
    content = chunk.data.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
print()
Async call

Use async calls to integrate the Mistral API into asynchronous Python applications that need concurrency.

python
import asyncio
from mistralai import Mistral
import os

async def main():
    client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])
    messages = [{"role": "user", "content": "What is AI?"}]
    response = await client.chat.complete_async(model="mistral-large-latest", messages=messages)
    print("Async response:", response.choices[0].message.content)

asyncio.run(main())
Use smaller model for faster response

Use the smaller model for quicker responses with lower cost when high detail is not required.

python
from mistralai import Mistral
import os

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])
messages = [{"role": "user", "content": "Summarize the latest news."}]
response = client.chat.complete(model="mistral-small-latest", messages=messages)
print("Summary:", response.choices[0].message.content)

Performance

Latency: ~700ms for `mistral-large-latest` non-streaming calls
Cost: ~$0.0015 per 1,000 tokens for `mistral-large-latest`
Rate limits: Default tier: 300 requests per minute, 60,000 tokens per minute
  • Use concise prompts to reduce token usage.
  • Prefer smaller models for less critical tasks to save cost.
  • Cache frequent queries to avoid repeated calls.
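The caching tip above can be sketched with functools.lru_cache. Here `call_mistral` is a placeholder standing in for whatever function wraps client.chat.complete() in your app; this is an illustration of the caching pattern, not SDK API:

```python
from functools import lru_cache

def call_mistral(prompt):
    # Placeholder: swap in a real client.chat.complete() call here
    return f"response for: {prompt}"

@lru_cache(maxsize=256)
def cached_call(prompt):
    # Identical prompts are served from the cache instead of a new API call
    return call_mistral(prompt)
```

Note that lru_cache keys on the prompt string, so it only helps when users repeat exactly the same query.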
| Approach | Latency | Cost/call | Best for |
| --- | --- | --- | --- |
| Standard call | ~700ms | ~$0.0015/1k tokens | General purpose, balanced speed and cost |
| Streaming response | Starts within 300ms | Same as standard | Long outputs, better UX |
| Async call | ~700ms | ~$0.0015/1k tokens | Concurrent applications |
| Smaller model | ~300ms | ~$0.0005/1k tokens | Faster, cheaper, less detailed |

Quick tip

Always set your API key in the environment variable `MISTRAL_API_KEY` and never hardcode it in your code.

Common mistake

Beginners often pass the `messages` parameter as a bare string instead of a list of dicts, each with a role and content key, causing API errors.
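A quick shape check catches this mistake before the request is sent. The `validate_messages` helper below is illustrative, not part of the SDK:

```python
def validate_messages(messages):
    """Raise ValueError unless messages is a list of dicts with role and content."""
    if not isinstance(messages, list):
        raise ValueError("messages must be a list of dicts")
    for m in messages:
        if not isinstance(m, dict) or "role" not in m or "content" not in m:
            raise ValueError("each message needs 'role' and 'content' keys")

validate_messages([{"role": "user", "content": "Hello"}])  # passes silently
```

Call it right before client.chat.complete() so malformed input fails with a clear local error instead of an API error.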

Verified 2026-04 · mistral-large-latest, mistral-small-latest