How-to · Beginner · 3 min read

How to use Together AI with OpenAI SDK

Quick answer
Use the OpenAI SDK with base_url="https://api.together.xyz/v1" and your TOGETHER_API_KEY to call Together AI models. Create chat completions with client.chat.completions.create() specifying the Together model name and messages.

PREREQUISITES

  • Python 3.8+
  • Together AI API key (set as TOGETHER_API_KEY environment variable)
  • pip install "openai>=1.0" (the quotes keep the shell from treating >= as a redirect)

Setup

Install the openai Python package and set your Together AI API key as an environment variable.

  • Install SDK: pip install openai
  • Set environment variable: export TOGETHER_API_KEY="your_api_key_here" (Linux/macOS) or setx TOGETHER_API_KEY "your_api_key_here" (Windows; setx only takes effect in new shell sessions)
bash
pip install openai
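Because a missing or empty key only surfaces later as an authentication error, it can help to check the environment before constructing the client. A minimal sketch (the require_key helper name is just for illustration):

```python
import os

def require_key(env=os.environ):
    """Return TOGETHER_API_KEY from the given mapping, or raise with a clear hint."""
    key = env.get("TOGETHER_API_KEY")
    if not key:
        raise RuntimeError(
            "TOGETHER_API_KEY is not set; export it before creating the client."
        )
    return key
```

Call require_key() once at startup and pass the result as api_key so failures happen immediately, not on the first request.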

Step by step

Use the OpenAI client with base_url set to Together AI's endpoint. Call chat.completions.create with the Together model and messages to get a chat response.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["TOGETHER_API_KEY"], base_url="https://api.together.xyz/v1")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Hello from Together AI!"}]
)

print(response.choices[0].message.content)
output
Hello from Together AI! How can I assist you today?

Common variations

You can use streaming to receive tokens as they are generated, or switch models by changing the model parameter. Async usage is supported via the SDK's AsyncOpenAI client with async/await.

python
import asyncio
import os
from openai import AsyncOpenAI

async def main():
    # AsyncOpenAI (not OpenAI) is required for await and async iteration
    client = AsyncOpenAI(api_key=os.environ["TOGETHER_API_KEY"], base_url="https://api.together.xyz/v1")

    # Streaming example: stream=True yields chunks as tokens are generated
    stream = await client.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
        messages=[{"role": "user", "content": "Stream a response from Together AI."}],
        stream=True
    )

    async for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)

if __name__ == "__main__":
    asyncio.run(main())
output
Streaming a response from Together AI in real time...

Troubleshooting

  • If you get authentication errors, verify your TOGETHER_API_KEY environment variable is set correctly.
  • If the model is not found, confirm the model name matches Together AI's current offerings.
  • For network issues, check your internet connection and firewall settings.
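For the transient network failures mentioned above, a simple exponential-backoff retry around the API call often helps. A minimal sketch (with_retries is an illustrative name; in production you would likely catch only specific exceptions such as openai.APIConnectionError rather than all of them):

```python
import time

def with_retries(call, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run call(); on failure, sleep base_delay * 2**attempt and retry, re-raising on the last attempt."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Usage: with_retries(lambda: client.chat.completions.create(model=..., messages=...)).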

Key Takeaways

  • Use the OpenAI SDK with base_url set to Together AI's API endpoint for integration.
  • Set your Together AI API key in the TOGETHER_API_KEY environment variable for authentication.
  • Streaming and async calls are supported for real-time and efficient usage.
  • Always verify model names and API keys to avoid common errors.
Verified 2026-04 · meta-llama/Llama-3.3-70B-Instruct-Turbo