How-to · Beginner · 3 min read

How to use DeepSeek on Together AI

Quick answer
Use the openai Python SDK with base_url set to Together AI's API endpoint and your TOGETHER_API_KEY. Call client.chat.completions.create with a DeepSeek model ID (e.g., deepseek-ai/DeepSeek-V3) and your messages to get completions.

Prerequisites

  • Python 3.8+
  • Together AI API key (set TOGETHER_API_KEY environment variable)
  • openai Python SDK, version 1.0 or later (pip install "openai>=1.0")

Setup

Install the openai Python package and set your Together AI API key as an environment variable.

  • Install SDK: pip install "openai>=1.0" (quote the version specifier so the shell does not treat > as a redirect)
  • Set environment variable: export TOGETHER_API_KEY='your_api_key' (Linux/macOS) or set TOGETHER_API_KEY=your_api_key (Windows cmd; in PowerShell use $env:TOGETHER_API_KEY='your_api_key')
bash
pip install "openai>=1.0"
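If you are working in a notebook or a quick script and prefer not to rely on shell exports, the key can also be set for the current process only. A minimal sketch (replace the placeholder with your real key, and never commit it to source control):

```python
import os

# Set the key for this process only if it is not already exported.
# "your_api_key" is a placeholder, not a working credential.
os.environ.setdefault("TOGETHER_API_KEY", "your_api_key")

print("TOGETHER_API_KEY is set:", bool(os.environ.get("TOGETHER_API_KEY")))
```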

Step by step

Use the OpenAI-compatible SDK with Together AI's base URL and your API key. Specify a DeepSeek model ID (e.g., deepseek-ai/DeepSeek-V3) and provide chat messages. The example below sends a user prompt and prints the model's response.

python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["TOGETHER_API_KEY"],
    base_url="https://api.together.xyz/v1"
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=[{"role": "user", "content": "Explain retrieval-augmented generation (RAG) in simple terms."}]
)

print(response.choices[0].message.content)
output
Retrieval-augmented generation (RAG) is a technique where an AI model first searches a large database or documents to find relevant information, then uses that information to generate more accurate and informed responses.
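Common optional parameters such as temperature and max_tokens pass through the same call unchanged. A hedged sketch of a small helper that assembles the request arguments (build_chat_request is not part of any SDK, and the default values are illustrative, not tuned recommendations):

```python
from typing import Any, Dict, List, Optional

def build_chat_request(prompt: str,
                       system: Optional[str] = None,
                       model: str = "deepseek-ai/DeepSeek-V3",
                       temperature: float = 0.7,
                       max_tokens: int = 256) -> Dict[str, Any]:
    """Assemble keyword arguments for client.chat.completions.create."""
    messages: List[Dict[str, str]] = []
    if system:
        # An optional system message steers tone and format.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,  # lower values make output more deterministic
        "max_tokens": max_tokens,    # caps the length of the completion
    }

# With a configured client:
# response = client.chat.completions.create(**build_chat_request(
#     "Summarize RAG in one sentence.",
#     system="You are a concise technical assistant."))
```

Keeping the arguments in one dict makes it easy to log or reuse the exact request that produced a given response.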

Common variations

You can use streaming to receive partial responses as they are generated, or switch to other DeepSeek models such as deepseek-ai/DeepSeek-R1 for reasoning tasks. Here's how to enable streaming:

python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["TOGETHER_API_KEY"],
    base_url="https://api.together.xyz/v1"
)

stream = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=[{"role": "user", "content": "Summarize the benefits of RAG."}],
    stream=True
)

for chunk in stream:
    if not chunk.choices:  # some chunks (e.g., a final usage chunk) carry no choices
        continue
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
output
Retrieval-augmented generation (RAG) improves AI responses by combining external knowledge retrieval with generation, making answers more accurate and up-to-date.
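When you switch to deepseek-ai/DeepSeek-R1, the response content typically interleaves the model's chain of thought inside <think>...</think> tags before the final answer (an assumption about the current serving format; verify against your own responses). A small sketch of a helper that separates the two (split_reasoning is hypothetical, not part of any SDK):

```python
import re
from typing import Tuple

def split_reasoning(content: str) -> Tuple[str, str]:
    """Split DeepSeek-R1 style output into (reasoning, answer).

    Assumes the reasoning is wrapped in a leading <think>...</think> block;
    if no such block is present, the whole content is treated as the answer.
    """
    match = re.match(r"\s*<think>(.*?)</think>\s*(.*)", content, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return "", content.strip()

# reasoning, answer = split_reasoning(response.choices[0].message.content)
```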

Troubleshooting

  • If you get authentication errors, verify your TOGETHER_API_KEY environment variable is set correctly.
  • If the model is not found, confirm you are using a model ID from Together AI's catalog, such as deepseek-ai/DeepSeek-V3 or deepseek-ai/DeepSeek-R1. Together uses org-prefixed IDs, not DeepSeek's own deepseek-chat/deepseek-reasoner names.
  • For network issues, check your internet connection and that https://api.together.xyz/v1 is reachable.
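The checks above can be bundled into a quick preflight that runs before any request. A minimal stdlib-only sketch (the VALID_MODELS set is an assumption about the current catalog; check Together AI's model list for the authoritative IDs):

```python
import os
from typing import List
from urllib.parse import urlparse

BASE_URL = "https://api.together.xyz/v1"
# Assumption: these are the current DeepSeek IDs on Together AI's catalog.
VALID_MODELS = {"deepseek-ai/DeepSeek-V3", "deepseek-ai/DeepSeek-R1"}

def preflight(model: str, base_url: str = BASE_URL) -> List[str]:
    """Return a list of likely configuration problems (empty means OK)."""
    problems = []
    if not os.environ.get("TOGETHER_API_KEY"):
        problems.append("TOGETHER_API_KEY is not set")
    if model not in VALID_MODELS:
        problems.append(f"unrecognized model name: {model!r}")
    parsed = urlparse(base_url)
    if parsed.scheme != "https" or not parsed.netloc:
        problems.append(f"base_url does not look like an HTTPS endpoint: {base_url!r}")
    return problems

# for problem in preflight("deepseek-ai/DeepSeek-V3"):
#     print("preflight:", problem)
```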

Key Takeaways

  • Use the OpenAI-compatible openai SDK with Together AI's base_url to access DeepSeek models.
  • Set your Together AI API key in the TOGETHER_API_KEY environment variable for authentication.
  • Streaming responses enable real-time token-by-token output from DeepSeek models.
  • DeepSeek offers specialized models on Together AI: deepseek-ai/DeepSeek-V3 for chat and deepseek-ai/DeepSeek-R1 for reasoning tasks.
  • Check environment variables and model names carefully to avoid common errors.
Verified 2026-04 · deepseek-ai/DeepSeek-V3, deepseek-ai/DeepSeek-R1