How to use Mixtral on Fireworks AI
Quick answer
Use the openai Python SDK with the base_url set to Fireworks AI's endpoint and specify the model as accounts/fireworks/models/mixtral-8x7b-instruct. Call client.chat.completions.create() with your messages to interact with Mixtral on Fireworks AI.
PREREQUISITES
- Python 3.8+
- Fireworks AI API key
- pip install "openai>=1.0" (quote the requirement so the shell does not treat >= as redirection)
Setup
Install the openai Python package and set your Fireworks AI API key as an environment variable. Use the Fireworks AI OpenAI-compatible endpoint for API calls.
pip install "openai>=1.0"
output
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x
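Before making any API calls, it helps to confirm the key is actually set. The helper below is a small sketch, assuming the key is stored in the FIREWORKS_API_KEY environment variable as described above; the function name require_api_key is illustrative, not part of any SDK.

```python
import os

def require_api_key() -> str:
    """Return the Fireworks AI API key, or exit with a clear message if unset."""
    key = os.environ.get("FIREWORKS_API_KEY")
    if not key:
        raise SystemExit("Set the FIREWORKS_API_KEY environment variable first.")
    return key

if __name__ == "__main__":
    print("API key found:", bool(require_api_key()))
```

Failing fast here turns a confusing authentication error later into an immediate, actionable message.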
Step by step
Use the OpenAI client with base_url set to Fireworks AI's API endpoint. Specify the Mixtral model and send chat messages. The example below sends a prompt and prints the response.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["FIREWORKS_API_KEY"],
    base_url="https://api.fireworks.ai/inference/v1",
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/mixtral-8x7b-instruct",
    messages=[{"role": "user", "content": "Explain retrieval-augmented generation (RAG) in simple terms."}],
)

print("Response:", response.choices[0].message.content)
output
Response: Retrieval-augmented generation (RAG) is a technique where an AI model retrieves relevant information from a large database or documents and then uses that information to generate accurate and informed responses.
Common variations
You can enable streaming to receive tokens as they are generated, or make asynchronous calls with the AsyncOpenAI client. You can also switch to other Fireworks AI models by changing the model parameter.
import asyncio
import os
from openai import AsyncOpenAI

async def main():
    # Use the async client so await and async-for are supported.
    client = AsyncOpenAI(
        api_key=os.environ["FIREWORKS_API_KEY"],
        base_url="https://api.fireworks.ai/inference/v1",
    )
    # Async streaming example
    stream = await client.chat.completions.create(
        model="accounts/fireworks/models/mixtral-8x7b-instruct",
        messages=[{"role": "user", "content": "Tell me a joke."}],
        stream=True,
    )
    async for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

if __name__ == "__main__":
    asyncio.run(main())
output
Why did the AI go to school? Because it wanted to improve its neural networks!
Troubleshooting
- If you get authentication errors, verify your FIREWORKS_API_KEY environment variable is set correctly.
- For model not found errors, confirm the model name is exactly accounts/fireworks/models/mixtral-8x7b-instruct.
- Network errors may require checking your internet connection or firewall settings.
Key Takeaways
- Use the OpenAI SDK with Fireworks AI's base_url to access Mixtral models.
- Specify the full model name starting with 'accounts/fireworks/models/' for Fireworks AI.
- Streaming and async calls are supported for real-time token generation.
- Always set your API key in the environment variable FIREWORKS_API_KEY.
- Check exact model names and API endpoint URLs to avoid errors.