Cerebras supported models

Cerebras serves llama3.3-70b and llama3.1-8b via its OpenAI-compatible API. Use the OpenAI Python SDK with the base_url set to https://api.cerebras.ai/v1 and specify these model names in your requests.

Prerequisites
- Python 3.8+
- CEREBRAS_API_KEY environment variable set
- pip install openai>=1.0
Setup
Install the official openai Python package (v1+) and set your Cerebras API key as an environment variable.
- Run pip install openai to install the SDK.
- Set CEREBRAS_API_KEY in your shell environment.

pip install openai
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x
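Before making any requests, it can help to fail fast when the key is missing. A minimal sketch (the helper name require_api_key is our own, not part of the SDK):

```python
import os

def require_api_key() -> str:
    """Return the Cerebras API key, raising a clear error if it is unset."""
    key = os.environ.get("CEREBRAS_API_KEY")
    if not key:
        raise RuntimeError("CEREBRAS_API_KEY is not set; export it before running.")
    return key
```

Calling this once at startup turns a confusing mid-request authentication failure into an immediate, actionable error.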
Step by step
Use the OpenAI client with the Cerebras API endpoint and call supported models like llama3.3-70b. Below is a complete example to send a chat completion request.
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["CEREBRAS_API_KEY"], base_url="https://api.cerebras.ai/v1")
response = client.chat.completions.create(
    model="llama3.3-70b",
    messages=[{"role": "user", "content": "Hello, Cerebras!"}]
)
print(response.choices[0].message.content)

Hello, Cerebras! How can I assist you today?
Common variations
You can switch to the smaller model llama3.1-8b by changing the model parameter. The Cerebras API is fully OpenAI-compatible, so you can use streaming or async calls with the same client pattern.
import asyncio
import os
from openai import AsyncOpenAI

async def main():
    # Use the async client; the sync OpenAI client has no awaitable methods
    client = AsyncOpenAI(api_key=os.environ["CEREBRAS_API_KEY"], base_url="https://api.cerebras.ai/v1")
    # Async streaming example: pass stream=True to the same create() method
    stream = await client.chat.completions.create(
        model="llama3.1-8b",
        messages=[{"role": "user", "content": "Stream a greeting from Cerebras."}],
        stream=True
    )
    async for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

asyncio.run(main())

Hello from Cerebras streaming model llama3.1-8b!
Troubleshooting
If you receive authentication errors, verify your CEREBRAS_API_KEY is correctly set in your environment. For model not found errors, confirm you are using a supported model name like llama3.3-70b or llama3.1-8b. Network issues may require checking your firewall or proxy settings.
Key Takeaways
- Use the llama3.3-70b and llama3.1-8b models with Cerebras via the OpenAI-compatible API.
- Set base_url="https://api.cerebras.ai/v1" in the OpenAI client to target Cerebras endpoints.
- The Cerebras API supports streaming and async calls with the same OpenAI SDK patterns.
- Keep your CEREBRAS_API_KEY secure and set it in your environment variables.
- Model availability and API details may change; verify at https://docs.cerebras.net/api.