Fireworks AI pricing
Prerequisites
- Python 3.8+
- Fireworks AI API key
- pip install "openai>=1.0"
Setup
Install the openai Python package to interact with Fireworks AI's OpenAI-compatible API. Set your Fireworks AI API key as an environment variable for secure authentication.
- Install package: pip install openai
- Set environment variable: export FIREWORKS_API_KEY='your_api_key' (Linux/macOS) or set FIREWORKS_API_KEY=your_api_key (Windows)

Example install output (version numbers will vary):
pip install openai
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x
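
Before sending any request, it can help to fail fast when the key is missing rather than hit an authentication error later. The following is a minimal sketch; the helper name get_fireworks_api_key is hypothetical and not part of the OpenAI SDK or any Fireworks AI library.

```python
import os


def get_fireworks_api_key(env=None):
    """Return the Fireworks AI API key, or raise a clear error if it is unset.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    key = env.get("FIREWORKS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FIREWORKS_API_KEY is not set; export it before running this script."
        )
    return key
```

The returned value can then be passed as api_key when constructing the client.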
Step by step
Use the OpenAI SDK with the Fireworks AI base URL to call the API. The code reads your key from the FIREWORKS_API_KEY environment variable through os.environ, so the key never appears in the source. This example sends a chat completion request to the Fireworks AI Llama 3.3 70B Instruct model.
import os

from openai import OpenAI

# Point the OpenAI client at Fireworks AI's OpenAI-compatible endpoint.
client = OpenAI(
    api_key=os.environ["FIREWORKS_API_KEY"],
    base_url="https://api.fireworks.ai/inference/v1",
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",
    messages=[{"role": "user", "content": "Hello, Fireworks AI pricing details?"}],
)
print(response.choices[0].message.content)

Example output:
Fireworks AI pricing is usage-based, charged per token processed. For exact rates, visit https://fireworks.ai/pricing.
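
Because billing is per token, a rough cost figure can be derived from the token counts that the OpenAI SDK exposes on response.usage (prompt_tokens and completion_tokens). The sketch below assumes a single flat rate per million tokens for both input and output, which may not match how a given model is priced. The function name estimate_cost_usd and the 0.50 rate are illustrative placeholders, not Fireworks AI prices.

```python
from types import SimpleNamespace


def estimate_cost_usd(usage, usd_per_million_tokens):
    """Estimate the cost of one request from its token usage.

    Assumes one flat rate for prompt and completion tokens. The rate is
    supplied by the caller; look it up on the official pricing page.
    """
    total_tokens = usage.prompt_tokens + usage.completion_tokens
    return total_tokens * usd_per_million_tokens / 1_000_000


# Stand-in for response.usage so the sketch runs without an API call.
example_usage = SimpleNamespace(prompt_tokens=12, completion_tokens=48)

# 0.50 is a placeholder rate for illustration, not a published price.
print(estimate_cost_usd(example_usage, usd_per_million_tokens=0.50))
```

With a real response, pass response.usage in place of example_usage.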
Common variations
You can switch models by changing the model parameter to other Fireworks AI models like accounts/fireworks/models/deepseek-r1. For asynchronous calls, use Python's asyncio with the OpenAI SDK's async client. Streaming responses are also supported by setting stream=True in the request.
import asyncio
import os

from openai import AsyncOpenAI  # the synchronous OpenAI client cannot be awaited


async def async_chat():
    client = AsyncOpenAI(
        api_key=os.environ["FIREWORKS_API_KEY"],
        base_url="https://api.fireworks.ai/inference/v1",
    )
    # With stream=True the call returns an async iterator of chunks.
    stream = await client.chat.completions.create(
        model="accounts/fireworks/models/llama-v3p3-70b-instruct",
        messages=[{"role": "user", "content": "Stream Fireworks AI pricing info."}],
        stream=True,
    )
    async for chunk in stream:
        # Each chunk carries an incremental piece of the reply (or None).
        print(chunk.choices[0].delta.content or "", end="", flush=True)


asyncio.run(async_chat())

Example output:
Fireworks AI pricing is usage-based, charged per token processed. For exact rates, visit https://fireworks.ai/pricing.
Troubleshooting
- If you receive authentication errors, verify your FIREWORKS_API_KEY environment variable is set correctly.
- For HTTP 429 rate limit errors, reduce request frequency or check your Fireworks AI plan limits.
- If the model is not found, confirm the model name matches Fireworks AI's current offerings.
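
For the rate limit case, one common way to reduce request frequency is to retry with exponential backoff. The helper below is a generic sketch, and call_with_retry is a hypothetical name rather than an SDK function. It takes the exception type to retry on as an argument. With the OpenAI SDK that would typically be openai.RateLimitError, while openai.AuthenticationError and openai.NotFoundError are better left to fail immediately.

```python
import time


def call_with_retry(call, retry_on, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run call() and retry with exponential backoff when `retry_on` is raised.

    Delays grow as base_delay, 2 * base_delay, 4 * base_delay, and so on.
    The last failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

A call might then look like call_with_retry(lambda: client.chat.completions.create(...), retry_on=openai.RateLimitError). The OpenAI client also accepts a max_retries argument for built-in retries, which may be enough for simple cases.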
Key Takeaways
- Fireworks AI pricing is usage-based and charged per token processed via API calls.
- Use the OpenAI SDK with the Fireworks AI base URL and your API key for integration.
- Check Fireworks AI's official pricing page regularly as rates and models may change.