How to use Llama on Fireworks AI
Quick answer
Use the openai Python SDK with your Fireworks AI API key, and set the base_url to Fireworks' OpenAI-compatible endpoint. Call client.chat.completions.create with the Llama model name accounts/fireworks/models/llama-v3p3-70b-instruct and your chat messages to get completions.

Prerequisites
- Python 3.8+
- A Fireworks AI API key
- pip install "openai>=1.0"
Setup
Install the openai Python package and set your Fireworks AI API key as an environment variable. Use the Fireworks AI OpenAI-compatible endpoint for API calls.
pip install "openai>=1.0"

output

Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x
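Before running the examples below, export your key in the shell so the Python code can read it from the environment (the value shown is a placeholder; use your real key from the Fireworks dashboard):

```shell
# Make the key available to the Python examples via os.environ.
# "your-fireworks-api-key" is a placeholder, not a real key.
export FIREWORKS_API_KEY="your-fireworks-api-key"
```

To persist it across sessions, add the same line to your shell profile (e.g. ~/.bashrc or ~/.zshrc).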
Step by step
Use the OpenAI SDK with your Fireworks API key and specify the Fireworks base URL. Call the chat.completions.create method with the Llama model and your messages.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["FIREWORKS_API_KEY"],
    base_url="https://api.fireworks.ai/inference/v1",
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)

print(response.choices[0].message.content)

output
Hello! I'm your Llama model on Fireworks AI, ready to assist you.
Common variations
You can use other Fireworks Llama models by changing the model parameter. For asynchronous calls, use the SDK's AsyncOpenAI client. Streaming responses are also supported: pass stream=True to chat.completions.create and iterate over the returned chunks.
import asyncio
import os

from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI(
        api_key=os.environ["FIREWORKS_API_KEY"],
        base_url="https://api.fireworks.ai/inference/v1",
    )
    # In openai>=1.0 the async client uses the same create method, awaited.
    response = await client.chat.completions.create(
        model="accounts/fireworks/models/llama-v3p3-70b-instruct",
        messages=[{"role": "user", "content": "Tell me a joke."}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())

output
Why did the computer show up at work late? Because it had a hard drive!
Troubleshooting
- If you get authentication errors, verify your FIREWORKS_API_KEY environment variable is set correctly.
- If the model is not found, confirm you are using the exact model name accounts/fireworks/models/llama-v3p3-70b-instruct.
- For network issues, check your internet connection and the Fireworks API status.
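A quick local sanity check for the first two issues, before making any API call (a minimal sketch; it only inspects your environment and the model-name format, and does not contact Fireworks):

```python
import os

# 1) Is the key present and free of stray whitespace?
key = os.environ.get("FIREWORKS_API_KEY")
if key is None:
    print("FIREWORKS_API_KEY is not set")
elif key != key.strip():
    print("FIREWORKS_API_KEY has leading or trailing whitespace")
else:
    print("FIREWORKS_API_KEY is set")

# 2) Does the model name have the expected accounts/<owner>/models/<model> shape?
MODEL = "accounts/fireworks/models/llama-v3p3-70b-instruct"
parts = MODEL.split("/")
assert len(parts) == 4 and parts[0] == "accounts" and parts[2] == "models"
```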
Key Takeaways
- Use the OpenAI SDK with Fireworks AI by setting the base_url to Fireworks endpoint.
- Specify the full Fireworks Llama model name when calling chat completions.
- Set your Fireworks API key in the environment variable FIREWORKS_API_KEY.
- Both synchronous and asynchronous calls are supported; streaming is available via stream=True.
- Verify model names and API keys to avoid common errors.