How to use Fireworks AI with OpenAI SDK
Quick answer
Use the OpenAI Python SDK with base_url set to Fireworks AI's OpenAI-compatible endpoint (https://api.fireworks.ai/inference/v1) and your Fireworks API key. Call client.chat.completions.create with a Fireworks model name such as accounts/fireworks/models/llama-v3p3-70b-instruct to generate completions.
Prerequisites
- Python 3.8+
- Fireworks AI API key
- pip install openai>=1.0
Setup
Install the official openai Python package (v1 or later) and set your Fireworks AI API key as an environment variable. Use the Fireworks AI OpenAI-compatible endpoint as the base_url.
pip install openai

Output:
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl
Installing collected packages: openai
Successfully installed openai-1.x.x
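The examples below read the key from the environment, so export it in your shell first (the key value here is a placeholder; use the key from your Fireworks AI dashboard):

```shell
# Set the Fireworks API key for the current shell session (placeholder value).
export FIREWORKS_API_KEY="fw_your_api_key_here"

# Confirm the variable is visible to child processes such as Python.
printenv FIREWORKS_API_KEY
```

Add the export line to your shell profile (e.g., ~/.bashrc) to make it persistent across sessions.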
Step by step
This example shows how to create a chat completion request using Fireworks AI with the OpenAI SDK. Set FIREWORKS_API_KEY in your environment to your actual API key; the code reads it via os.environ.
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["FIREWORKS_API_KEY"],
    base_url="https://api.fireworks.ai/inference/v1"
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",
    messages=[{"role": "user", "content": "Hello, Fireworks AI!"}]
)

print(response.choices[0].message.content)

Output:
Hello, Fireworks AI! How can I assist you today?
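Optional sampling parameters such as temperature and max_tokens follow the standard chat-completions shape. A small sketch of assembling the request arguments (build_chat_request is a hypothetical helper, and the default values are arbitrary examples, not Fireworks defaults):

```python
def build_chat_request(model, user_message, temperature=0.7, max_tokens=256):
    """Assemble keyword arguments for client.chat.completions.create."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
```

You can then call client.chat.completions.create(**build_chat_request(...)), which keeps request construction in one place when you vary models or parameters.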
Common variations
- Use other Fireworks AI models by changing the model parameter, e.g., accounts/fireworks/models/deepseek-r1.
- For asynchronous calls, use an async client pattern with asyncio and await.
- Enable streaming by passing stream=True to chat.completions.create and iterating over the response.
import asyncio
import os

from openai import AsyncOpenAI

async def main():
    # The async client is required for await and "async for" streaming.
    client = AsyncOpenAI(
        api_key=os.environ["FIREWORKS_API_KEY"],
        base_url="https://api.fireworks.ai/inference/v1"
    )
    stream = await client.chat.completions.create(
        model="accounts/fireworks/models/llama-v3p3-70b-instruct",
        messages=[{"role": "user", "content": "Stream a response."}],
        stream=True
    )
    async for chunk in stream:
        # Each chunk carries an incremental delta; content may be None.
        print(chunk.choices[0].delta.content or "", end="", flush=True)

asyncio.run(main())

Output:
Streaming response text here...
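When streaming, each chunk carries only an incremental delta, so the full reply has to be accumulated client-side if you need it as one string. A minimal sketch of such a helper (collect_stream is a hypothetical name, not part of the SDK; it assumes the OpenAI SDK's streaming chunk shape, where chunk.choices[0].delta.content is a text fragment or None):

```python
def collect_stream(chunks):
    """Join the delta fragments from a stream of chat-completion chunks.

    Assumes each chunk follows the OpenAI SDK streaming shape:
    chunk.choices[0].delta.content is a text fragment or None.
    """
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # skip None deltas (e.g., role-only or final chunks)
            parts.append(delta)
    return "".join(parts)
```

Pass it the iterator returned by chat.completions.create(..., stream=True), or append chunks inside the async loop and join them the same way.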
Troubleshooting
- If you get authentication errors, verify that your FIREWORKS_API_KEY environment variable is set correctly.
- Ensure the base_url is exactly https://api.fireworks.ai/inference/v1.
- If the model is not found, confirm you are using a valid Fireworks AI model name starting with accounts/fireworks/models/.
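Because "model not found" errors usually come from a malformed model name, a quick local check of the name's shape can save a round trip. This is an illustrative helper, not part of the OpenAI SDK, and it cannot confirm that the model actually exists on Fireworks AI:

```python
FIREWORKS_MODEL_PREFIX = "accounts/fireworks/models/"

def looks_like_fireworks_model(name):
    """Return True if the name matches the Fireworks naming scheme.

    Only validates the shape of the name (prefix plus a non-empty
    model id); it does not query the API.
    """
    return (name.startswith(FIREWORKS_MODEL_PREFIX)
            and len(name) > len(FIREWORKS_MODEL_PREFIX))
```

For example, a bare name like "llama-v3p3-70b-instruct" fails the check, flagging the missing accounts/fireworks/models/ prefix before the API rejects the request.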
Key Takeaways
- Use the OpenAI SDK with Fireworks AI by setting the base_url to the Fireworks AI endpoint (https://api.fireworks.ai/inference/v1).
- Specify Fireworks model names fully, e.g., accounts/fireworks/models/llama-v3p3-70b-instruct.
- Set your Fireworks API key in the environment variable FIREWORKS_API_KEY.
- Streaming and async calls are supported with the OpenAI SDK pattern.
- Check model names and API key if you encounter errors.