How to · Beginner · 3 min read

Cerebras for code generation

Quick answer
Point the OpenAI Python SDK at the Cerebras API by setting base_url="https://api.cerebras.ai/v1" and passing your Cerebras API key. Then call client.chat.completions.create() with a code-generation prompt and a model such as llama3.1-8b or llama3.3-70b to generate code.

PREREQUISITES

  • Python 3.8+
  • CEREBRAS_API_KEY environment variable set
  • pip install "openai>=1.0" (quoted so the shell does not treat >= as a redirect)

Setup

Install the official openai Python package and set your Cerebras API key as an environment variable.

  • Install the SDK: pip install openai
  • Set environment variable: export CEREBRAS_API_KEY='your_api_key' (Linux/macOS) or set CEREBRAS_API_KEY=your_api_key (Windows CMD)
bash
pip install openai
output
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x
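Before making any API calls, it helps to fail fast when the key is missing rather than hit a confusing authentication error later. A minimal sketch (the require_api_key helper is ours, not part of either SDK):

```python
import os

def require_api_key(var: str = "CEREBRAS_API_KEY") -> str:
    """Return the API key from the environment, failing fast with a clear message."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; export it before running the examples.")
    return key
```

Call it once at startup, e.g. `client = OpenAI(api_key=require_api_key(), base_url="https://api.cerebras.ai/v1")`.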

Step by step

This example shows how to generate Python code using the Cerebras API with the llama3.1-8b model. It sends a prompt asking for a Python function and prints the generated code; the exact output will vary from run to run.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["CEREBRAS_API_KEY"], base_url="https://api.cerebras.ai/v1")

messages = [
    {"role": "user", "content": "Write a Python function to compute Fibonacci numbers."}
]

response = client.chat.completions.create(
    model="llama3.1-8b",
    messages=messages
)

print("Generated code:\n", response.choices[0].message.content)
output
Generated code:
 def fibonacci(n):
     if n <= 0:
         return 0
     elif n == 1:
         return 1
     else:
         return fibonacci(n-1) + fibonacci(n-2)
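Models often wrap generated code in Markdown code fences, sometimes with surrounding prose. A small helper to pull out just the code before saving or executing it — a sketch written for this guide, not part of the SDK (the fence marker is built indirectly so the example can live inside this page):

```python
import re

# Markdown fence marker, assembled so it doesn't clash with this page's own fences.
FENCE = "`" * 3

def extract_code(reply: str) -> str:
    """Return the body of the first fenced code block in a model reply,
    or the whole reply when no fence is present."""
    pattern = FENCE + r"[\w-]*\n(.*?)" + FENCE
    match = re.search(pattern, reply, re.DOTALL)
    return match.group(1) if match else reply
```

Usage: `code = extract_code(response.choices[0].message.content)`.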

Common variations

For more complex code generation, switch to a larger model such as llama3.3-70b. The SDK also supports streaming responses for real-time output: pass stream=True to chat.completions.create() and iterate over the returned stream.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["CEREBRAS_API_KEY"], base_url="https://api.cerebras.ai/v1")

messages = [
    {"role": "user", "content": "Generate a Python class for a linked list."}
]

stream = client.chat.completions.create(
    model="llama3.3-70b",
    messages=messages,
    stream=True
)

print("Streaming generated code:")
for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
output
Streaming generated code:
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None

    def append(self, data):
        new_node = Node(data)
        if not self.head:
            self.head = new_node
        else:
            current = self.head
            while current.next:
                current = current.next
            current.next = new_node

    def display(self):
        current = self.head
        while current:
            print(current.data, end=" -> ")
            current = current.next
        print("None")
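If you also want the complete text after streaming (to save it to a file, say), accumulate the deltas as they arrive. A sketch of just the accumulation logic — collect_stream is our helper name, and each chunk is assumed to have the chunk.choices[0].delta.content shape used in the loop above:

```python
def collect_stream(chunks) -> str:
    """Concatenate the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        # The final chunk's delta.content may be None, so fall back to "".
        delta = chunk.choices[0].delta.content or ""
        parts.append(delta)
    return "".join(parts)
```

Usage: `code = collect_stream(stream)` (note a stream can only be iterated once).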

Troubleshooting

  • If you get authentication errors, verify your CEREBRAS_API_KEY environment variable is set correctly.
  • For network errors, check your internet connection and confirm the base_url is exactly https://api.cerebras.ai/v1.
  • If the model is not found, confirm you are using a valid Cerebras model name like llama3.1-8b or llama3.3-70b.
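Transient network errors are often worth retrying. A generic exponential-backoff sketch (with_retries is our helper name; in real code you would catch the SDK's connection-error exception rather than the built-in ConnectionError used here for illustration):

```python
import time

def with_retries(call, attempts=3, base_delay=1.0):
    """Run call(), retrying with exponential backoff on connection errors."""
    for attempt in range(attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            time.sleep(base_delay * 2 ** attempt)
```

Usage: `response = with_retries(lambda: client.chat.completions.create(model="llama3.1-8b", messages=messages))`.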

Key Takeaways

  • Use the OpenAI SDK with base_url="https://api.cerebras.ai/v1" to access Cerebras models.
  • Models like llama3.1-8b and llama3.3-70b are suitable for code generation tasks.
  • Enable streaming with stream=True for real-time code output.
  • Always set your API key in the CEREBRAS_API_KEY environment variable.
  • Check model names and network settings if you encounter errors.
Verified 2026-04 · llama3.1-8b, llama3.3-70b