How-to · Intermediate · 3 min read

How to fine-tune a model on Together AI

Quick answer
Together AI supports fine-tuning via its OpenAI-compatible API using the openai Python SDK. Upload your training data as a file, then create a fine-tuning job with client.fine_tuning.jobs.create(), and finally use the fine-tuned model for chat completions.

PREREQUISITES

  • Python 3.8+
  • Together AI API key
  • pip install "openai>=1.0"

Setup

Install the openai Python package and set your Together AI API key as an environment variable. Use the Together AI API base URL to ensure requests go to their platform.

bash
pip install "openai>=1.0"
export TOGETHER_API_KEY="your-api-key"

Step by step

This example shows how to upload a training file, create a fine-tuning job, monitor its status, and use the fine-tuned model for chat completions on Together AI.
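The training_data.jsonl file referenced below is assumed to use the chat-format JSONL that OpenAI-compatible fine-tuning endpoints expect: one JSON object per line, each with a messages array of role/content pairs. A minimal sketch for producing such a file (the example conversations are placeholders, not real training data):

```python
import json

# Placeholder conversations; replace with your own examples
examples = [
    {"messages": [
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
    ]},
    {"messages": [
        {"role": "user", "content": "Summarize photosynthesis in one sentence."},
        {"role": "assistant", "content": "Plants convert light, water, and CO2 into sugar and oxygen."},
    ]},
]

# JSONL: one JSON object per line, no enclosing array
with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```

Each line must parse as standalone JSON; a single pretty-printed JSON array will be rejected as an invalid training file.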

python
import os
import time
from openai import OpenAI

# Initialize client with Together AI base URL
client = OpenAI(
    api_key=os.environ["TOGETHER_API_KEY"],
    base_url="https://api.together.xyz/v1"
)

# Step 1: Upload training file (JSONL format with messages)
with open("training_data.jsonl", "rb") as f:
    training_file = client.files.create(file=f, purpose="fine-tune")

print(f"Uploaded file ID: {training_file.id}")

# Step 2: Create fine-tuning job
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo"
)

print(f"Started fine-tuning job ID: {job.id}")

# Step 3: Poll job status until done
while True:
    status = client.fine_tuning.jobs.retrieve(job.id)
    print(f"Job status: {status.status}")
    if status.status in ["succeeded", "failed"]:
        break
    time.sleep(30)

if status.status == "succeeded":
    fine_tuned_model = status.fine_tuned_model
    print(f"Fine-tuned model ready: {fine_tuned_model}")

    # Step 4: Use fine-tuned model for chat
    response = client.chat.completions.create(
        model=fine_tuned_model,
        messages=[{"role": "user", "content": "Hello, how are you?"}]
    )
    print("Response:", response.choices[0].message.content)
else:
    print("Fine-tuning failed.")
output
Uploaded file ID: file-abc123xyz
Started fine-tuning job ID: job-xyz789abc
Job status: running
Job status: running
Job status: succeeded
Fine-tuned model ready: meta-llama/Llama-3.3-70B-Instruct-Turbo-ft-2026-04
Response: Hello! I'm your fine-tuned assistant. How can I help you today?

Common variations

  • Use the AsyncOpenAI client with asyncio for non-blocking fine-tuning monitoring.
  • Swap the model parameter for another Together AI base model, such as meta-llama/Llama-3.1-8B-Instruct-Turbo for faster, cheaper runs.
  • Stream chat completions by setting stream=True in client.chat.completions.create().
python
import asyncio
import os

from openai import AsyncOpenAI

async def async_fine_tune():
    # AsyncOpenAI exposes the same methods as OpenAI, but awaitable
    # (there are no acreate/aretrieve variants in the openai SDK)
    client = AsyncOpenAI(
        api_key=os.environ["TOGETHER_API_KEY"],
        base_url="https://api.together.xyz/v1"
    )

    # Upload training file
    with open("training_data.jsonl", "rb") as f:
        training_file = await client.files.create(file=f, purpose="fine-tune")

    job = await client.fine_tuning.jobs.create(
        training_file=training_file.id,
        model="meta-llama/Llama-3.3-70B-Instruct-Turbo"
    )

    # Poll without blocking the event loop
    while True:
        status = await client.fine_tuning.jobs.retrieve(job.id)
        print(f"Job status: {status.status}")
        if status.status in ["succeeded", "failed"]:
            break
        await asyncio.sleep(30)

    if status.status == "succeeded":
        fine_tuned_model = status.fine_tuned_model
        print(f"Fine-tuned model ready: {fine_tuned_model}")

        # Stream the chat completion; async streams are iterated with async for
        stream = await client.chat.completions.create(
            model=fine_tuned_model,
            messages=[{"role": "user", "content": "Hello"}],
            stream=True
        )
        async for chunk in stream:
            print(chunk.choices[0].delta.content or "", end="", flush=True)

asyncio.run(async_fine_tune())
output
Job status: running
Job status: running
Job status: succeeded
Fine-tuned model ready: meta-llama/Llama-3.3-70B-Instruct-Turbo-ft-2026-04
Hello! I'm your fine-tuned assistant.

Troubleshooting

  • If you get authentication errors, verify your TOGETHER_API_KEY environment variable is set correctly.
  • Ensure your training file is in valid JSONL format with proper message structure.
  • If the fine-tuning job fails, check the job status details for error messages and adjust your training data accordingly.
  • Use appropriate model names supported by Together AI; unsupported models will cause job creation errors.
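A quick local check can catch malformed JSONL before you upload it; below is a minimal sketch, where the validate_jsonl helper is illustrative (not part of any SDK) and assumes the chat-format schema used in the examples above:

```python
import json

def validate_jsonl(path):
    """Return a list of problems; an empty list means the file looks valid."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                problems.append(f"line {i}: invalid JSON")
                continue
            msgs = obj.get("messages")
            if not isinstance(msgs, list) or not msgs:
                problems.append(f"line {i}: missing or empty 'messages' list")
            elif any("role" not in m or "content" not in m for m in msgs):
                problems.append(f"line {i}: each message needs 'role' and 'content'")
    return problems

# Demo on a small sample file: one valid line, one broken line
with open("sample.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps({"messages": [{"role": "user", "content": "hi"},
                                     {"role": "assistant", "content": "hello"}]}) + "\n")
    f.write('{"messages": "oops"}\n')

for problem in validate_jsonl("sample.jsonl"):
    print(problem)  # → line 2: missing or empty 'messages' list
```

Running this before the upload step fails fast on your machine instead of wasting a fine-tuning job on a rejected file.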

Key Takeaways

  • Use the OpenAI-compatible openai SDK with Together AI's base URL for fine-tuning.
  • Upload your training data file and create a fine-tuning job with client.fine_tuning.jobs.create().
  • Poll the job status until completion, then use the fine-tuned model for chat completions.
  • Async and streaming variants improve responsiveness and user experience.
  • Validate your training data format and API key to avoid common errors.
Verified 2026-04 · meta-llama/Llama-3.3-70B-Instruct-Turbo, meta-llama/Llama-3.1-8B-Instruct-Turbo