How-to · Beginner · 3 min read

How to use Together AI with OpenAI SDK

Quick answer
Use the OpenAI SDK with base_url="https://api.together.xyz/v1" and your TOGETHER_API_KEY to call Together AI models. Create chat completions with client.chat.completions.create() specifying the Together model name and messages.

PREREQUISITES

  • Python 3.8+
  • Together AI API key (set as TOGETHER_API_KEY environment variable)
  • pip install "openai>=1.0" (the quotes keep the shell from treating >= as a redirect)

Setup

Install the openai Python package and set your Together AI API key as an environment variable.

  • Install SDK: pip install openai
  • Set environment variable: export TOGETHER_API_KEY="your_api_key_here" (Linux/macOS) or setx TOGETHER_API_KEY "your_api_key_here" (Windows; setx only takes effect in new shell sessions)
bash
pip install openai
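Because a missing or empty key only surfaces later as an authentication error, it can help to check the environment before constructing the client. A minimal sketch (the require_key helper name is just for illustration):

```python
import os

def require_key(env=os.environ):
    """Return TOGETHER_API_KEY from the given mapping, or raise with a clear hint."""
    key = env.get("TOGETHER_API_KEY")
    if not key:
        raise RuntimeError(
            "TOGETHER_API_KEY is not set; export it before creating the client."
        )
    return key
```

Call require_key() once at startup and pass the result as api_key so failures happen immediately, not on the first request.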

Step by step

Use the OpenAI client with base_url set to Together AI's endpoint. Call chat.completions.create with the Together model and messages to get a chat response.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["TOGETHER_API_KEY"], base_url="https://api.together.xyz/v1")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Hello from Together AI!"}]
)

print(response.choices[0].message.content)
output
Hello from Together AI! How can I assist you today?

Common variations

You can use streaming to receive tokens as they are generated, or switch models by changing the model parameter. Async usage is supported via the SDK's AsyncOpenAI client with async/await.

python
import asyncio
import os
from openai import AsyncOpenAI

async def main():
    # AsyncOpenAI (not OpenAI) is required for await and async iteration
    client = AsyncOpenAI(api_key=os.environ["TOGETHER_API_KEY"], base_url="https://api.together.xyz/v1")

    # Streaming example: stream=True yields chunks as tokens are generated
    stream = await client.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
        messages=[{"role": "user", "content": "Stream a response from Together AI."}],
        stream=True
    )

    async for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)

if __name__ == "__main__":
    asyncio.run(main())
output
Streaming a response from Together AI in real time...

Troubleshooting

  • If you get authentication errors, verify your TOGETHER_API_KEY environment variable is set correctly.
  • If the model is not found, confirm the model name matches Together AI's current offerings.
  • For network issues, check your internet connection and firewall settings.
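For the transient network failures mentioned above, a simple exponential-backoff retry around the API call often helps. A minimal sketch (with_retries is an illustrative name; in production you would likely catch only specific exceptions such as openai.APIConnectionError rather than all of them):

```python
import time

def with_retries(call, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run call(); on failure, sleep base_delay * 2**attempt and retry, re-raising on the last attempt."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Usage: with_retries(lambda: client.chat.completions.create(model=..., messages=...)).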

Key Takeaways

  • Use the OpenAI SDK with base_url set to Together AI's API endpoint for integration.
  • Set your Together AI API key in the TOGETHER_API_KEY environment variable for authentication.
  • Streaming and async calls are supported for real-time and efficient usage.
  • Always verify model names and API keys to avoid common errors.
Verified 2026-04 · meta-llama/Llama-3.3-70B-Instruct-Turbo