# Fireworks AI JSON mode
## Quick answer

Use the `openai` Python SDK with `base_url` set to Fireworks AI's endpoint and specify the model name starting with `accounts/fireworks/models/`. Send your prompt in `messages` and parse JSON tool calls from `response.choices[0].message.tool_calls` when `finish_reason` is `tool_calls`.

## Prerequisites

- Python 3.8+
- Fireworks AI API key
- `pip install "openai>=1.0"`
## Setup

Install the `openai` Python package and set your Fireworks AI API key as an environment variable. Use the Fireworks AI OpenAI-compatible endpoint for API calls.

```shell
pip install openai
```

Output:

```
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl
Installing collected packages: openai
Successfully installed openai-1.x.x
```
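Before running any of the Python examples, export the API key in your shell. A minimal sketch (the placeholder value is not a real key; `FIREWORKS_API_KEY` is the variable name the examples below read):

```shell
# Set the Fireworks AI API key for the current shell session
# ("your-api-key-here" is a placeholder, not a real key).
export FIREWORKS_API_KEY="your-api-key-here"

# Confirm the variable is set before running the Python examples.
echo "key set: ${FIREWORKS_API_KEY:+yes}"
# prints: key set: yes
```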
## Step by step

Use the OpenAI SDK with `base_url` set to Fireworks AI's API endpoint and your API key from environment variables. Define at least one tool so the model can respond with structured JSON arguments (without a `tools` parameter, `finish_reason` will never be `tool_calls`), then call `chat.completions.create` with the Fireworks AI model and parse the JSON tool calls if present.

```python
import json
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["FIREWORKS_API_KEY"],
    base_url="https://api.fireworks.ai/inference/v1",
)

# A tools definition is required for the model to return tool_calls.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather forecast for a location.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

messages = [{"role": "user", "content": "Get me the weather forecast in JSON format for New York City."}]

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",
    messages=messages,
    tools=tools,
)

print("Response text:", response.choices[0].message.content)

if response.choices[0].finish_reason == "tool_calls":
    tool_call = response.choices[0].message.tool_calls[0]
    # function.arguments is a JSON string; decode it before use.
    args = json.loads(tool_call.function.arguments)
    print("Tool call function name:", tool_call.function.name)
    print("Tool call arguments:", args)
```

Output:

```
Response text: Here is the weather forecast for New York City in JSON format.
Tool call function name: get_weather
Tool call arguments: {'location': 'New York City'}
```

## Common variations
- Use different Fireworks AI models by changing the `model` parameter, e.g., `accounts/fireworks/models/deepseek-r1`.
- Enable streaming by adding `stream=True` to `chat.completions.create` and iterating over the response.
- Use asynchronous calls with `asyncio` and `await` if your environment supports it.
```python
import asyncio
import os

from openai import AsyncOpenAI  # the async client; OpenAI alone is synchronous

async def async_chat():
    client = AsyncOpenAI(
        api_key=os.environ["FIREWORKS_API_KEY"],
        base_url="https://api.fireworks.ai/inference/v1",
    )
    messages = [{"role": "user", "content": "Tell me a joke in JSON format."}]
    stream = await client.chat.completions.create(
        model="accounts/fireworks/models/llama-v3p3-70b-instruct",
        messages=messages,
        stream=True,
    )
    # Each chunk carries an incremental delta of the response text.
    async for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)

asyncio.run(async_chat())
```

Output:

```
Why did the programmer quit his job? Because he didn't get arrays!
```
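The examples above extract JSON from tool calls, but Fireworks' OpenAI-compatible API also accepts a `response_format` parameter that constrains `message.content` itself to valid JSON. A minimal sketch, assuming `response_format={"type": "json_object"}` is supported for the model you choose (the environment guard keeps the snippet safe to run without a key or network access):

```python
import json
import os

# Request body for JSON mode: response_format asks the model to emit
# syntactically valid JSON directly in message.content.
request_kwargs = dict(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",
    messages=[{"role": "user",
               "content": "Give me the weather forecast for New York City as JSON."}],
    response_format={"type": "json_object"},
)

# Only call the API when a key is configured, so the sketch runs offline too.
if os.environ.get("FIREWORKS_API_KEY"):
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["FIREWORKS_API_KEY"],
        base_url="https://api.fireworks.ai/inference/v1",
    )
    response = client.chat.completions.create(**request_kwargs)
    # In JSON mode the content itself is the JSON payload; no tool_calls needed.
    print(json.loads(response.choices[0].message.content))
```

With this variation there is no `tools` schema to define; the trade-off is that you get free-form JSON rather than arguments validated against a function signature.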
## Troubleshooting

- If you get authentication errors, verify your `FIREWORKS_API_KEY` environment variable is set correctly.
- If the model is not found, confirm you are using the correct Fireworks AI model name starting with `accounts/fireworks/models/`.
- For JSON parsing errors, ensure the tool call arguments are valid JSON strings before loading.
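For the last point, a small defensive wrapper around `json.loads` keeps a malformed or truncated arguments string from crashing your handler. A minimal sketch (`parse_tool_arguments` is a hypothetical helper, not part of the SDK):

```python
import json

def parse_tool_arguments(raw: str) -> dict:
    """Decode a tool call's arguments string, returning an empty dict
    on malformed JSON instead of raising."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    # Tool arguments should be a JSON object; treat anything else as invalid.
    return parsed if isinstance(parsed, dict) else {}

print(parse_tool_arguments('{"location": "New York City"}'))  # {'location': 'New York City'}
print(parse_tool_arguments('{"location": '))                  # {} (truncated JSON)
```

You would call it in place of the bare `json.loads(tool_call.function.arguments)` in the step-by-step example.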
## Key Takeaways

- Use the OpenAI SDK with `base_url` set to Fireworks AI's endpoint for JSON mode.
- Parse `tool_calls` from the response to handle JSON function calls.
- Specify Fireworks AI models with the full model path starting with `accounts/fireworks/models/`.
- Streaming and async calls are supported with the same SDK pattern.
- Always set your API key in environment variables to avoid authentication issues.