# Fireworks AI JSON mode
## Quick answer

Use the `openai` Python SDK with `base_url` set to Fireworks AI's endpoint and specify the model name starting with `accounts/fireworks/models/`. Send your prompt in `messages` and parse JSON tool calls from `response.choices[0].message.tool_calls` when `finish_reason` is `tool_calls`.

## Prerequisites

- Python 3.8+
- Fireworks AI API key
- `pip install "openai>=1.0"`
## Setup

Install the `openai` Python package and set your Fireworks AI API key as an environment variable. Use the Fireworks AI OpenAI-compatible endpoint for API calls.

```shell
pip install openai
```

Output:

```
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl
Installing collected packages: openai
Successfully installed openai-1.x.x
```
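Before running any of the Python examples, export the API key in your shell. A minimal sketch (the placeholder value is not a real key; `FIREWORKS_API_KEY` is the variable name the examples below read):

```shell
# Set the Fireworks AI API key for the current shell session
# ("your-api-key-here" is a placeholder, not a real key).
export FIREWORKS_API_KEY="your-api-key-here"

# Confirm the variable is set before running the Python examples.
echo "key set: ${FIREWORKS_API_KEY:+yes}"
# prints: key set: yes
```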
## Step by step

Use the OpenAI SDK with `base_url` set to Fireworks AI's API endpoint and your API key from environment variables. Define at least one tool so the model can respond with structured JSON arguments (without a `tools` parameter, `finish_reason` will never be `tool_calls`), then call `chat.completions.create` with the Fireworks AI model and parse the JSON tool calls if present.

```python
import json
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["FIREWORKS_API_KEY"],
    base_url="https://api.fireworks.ai/inference/v1",
)

# A tools definition is required for the model to return tool_calls.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather forecast for a location.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

messages = [{"role": "user", "content": "Get me the weather forecast in JSON format for New York City."}]

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",
    messages=messages,
    tools=tools,
)

print("Response text:", response.choices[0].message.content)

if response.choices[0].finish_reason == "tool_calls":
    tool_call = response.choices[0].message.tool_calls[0]
    # function.arguments is a JSON string; decode it before use.
    args = json.loads(tool_call.function.arguments)
    print("Tool call function name:", tool_call.function.name)
    print("Tool call arguments:", args)
```

Output:

```
Response text: Here is the weather forecast for New York City in JSON format.
Tool call function name: get_weather
Tool call arguments: {'location': 'New York City'}
```

## Common variations
- Use different Fireworks AI models by changing the `model` parameter, e.g., `accounts/fireworks/models/deepseek-r1`.
- Enable streaming by adding `stream=True` to `chat.completions.create` and iterating over the response.
- Use asynchronous calls with `asyncio` and `await` if your environment supports it.
```python
import asyncio
import os

from openai import AsyncOpenAI  # the async client; OpenAI alone is synchronous

async def async_chat():
    client = AsyncOpenAI(
        api_key=os.environ["FIREWORKS_API_KEY"],
        base_url="https://api.fireworks.ai/inference/v1",
    )
    messages = [{"role": "user", "content": "Tell me a joke in JSON format."}]
    stream = await client.chat.completions.create(
        model="accounts/fireworks/models/llama-v3p3-70b-instruct",
        messages=messages,
        stream=True,
    )
    # Each chunk carries an incremental delta of the response text.
    async for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)

asyncio.run(async_chat())
```

Output:

```
Why did the programmer quit his job? Because he didn't get arrays!
```
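The examples above extract JSON from tool calls, but Fireworks' OpenAI-compatible API also accepts a `response_format` parameter that constrains `message.content` itself to valid JSON. A minimal sketch, assuming `response_format={"type": "json_object"}` is supported for the model you choose (the environment guard keeps the snippet safe to run without a key or network access):

```python
import json
import os

# Request body for JSON mode: response_format asks the model to emit
# syntactically valid JSON directly in message.content.
request_kwargs = dict(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",
    messages=[{"role": "user",
               "content": "Give me the weather forecast for New York City as JSON."}],
    response_format={"type": "json_object"},
)

# Only call the API when a key is configured, so the sketch runs offline too.
if os.environ.get("FIREWORKS_API_KEY"):
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["FIREWORKS_API_KEY"],
        base_url="https://api.fireworks.ai/inference/v1",
    )
    response = client.chat.completions.create(**request_kwargs)
    # In JSON mode the content itself is the JSON payload; no tool_calls needed.
    print(json.loads(response.choices[0].message.content))
```

With this variation there is no `tools` schema to define; the trade-off is that you get free-form JSON rather than arguments validated against a function signature.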
## Troubleshooting

- If you get authentication errors, verify your `FIREWORKS_API_KEY` environment variable is set correctly.
- If the model is not found, confirm you are using the correct Fireworks AI model name starting with `accounts/fireworks/models/`.
- For JSON parsing errors, ensure the tool call arguments are valid JSON strings before loading.
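For the last point, a small defensive wrapper around `json.loads` keeps a malformed or truncated arguments string from crashing your handler. A minimal sketch (`parse_tool_arguments` is a hypothetical helper, not part of the SDK):

```python
import json

def parse_tool_arguments(raw: str) -> dict:
    """Decode a tool call's arguments string, returning an empty dict
    on malformed JSON instead of raising."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    # Tool arguments should be a JSON object; treat anything else as invalid.
    return parsed if isinstance(parsed, dict) else {}

print(parse_tool_arguments('{"location": "New York City"}'))  # {'location': 'New York City'}
print(parse_tool_arguments('{"location": '))                  # {} (truncated JSON)
```

You would call it in place of the bare `json.loads(tool_call.function.arguments)` in the step-by-step example.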
## Key Takeaways

- Use the OpenAI SDK with `base_url` set to Fireworks AI's endpoint for JSON mode.
- Parse `tool_calls` from the response to handle JSON function calls.
- Specify Fireworks AI models with the full model path starting with `accounts/fireworks/models/`.
- Streaming and async calls are supported with the same SDK pattern.
- Always set your API key in environment variables to avoid authentication issues.