How-to · Beginner · 3 min read

How to use DeepSeek on Together AI

Quick answer
Use the openai Python SDK with base_url set to Together AI's API endpoint and your TOGETHER_API_KEY. Call client.chat.completions.create with a DeepSeek model ID (e.g., deepseek-ai/DeepSeek-V3) and your messages to get completions.

Prerequisites

  • Python 3.8+
  • Together AI API key (set TOGETHER_API_KEY environment variable)
  • openai Python SDK, version 1.0 or later (pip install "openai>=1.0")

Setup

Install the openai Python package and set your Together AI API key as an environment variable.

  • Install SDK: pip install "openai>=1.0" (quote the version specifier so the shell does not treat > as a redirect)
  • Set environment variable: export TOGETHER_API_KEY='your_api_key' (Linux/macOS) or set TOGETHER_API_KEY=your_api_key (Windows cmd; in PowerShell use $env:TOGETHER_API_KEY='your_api_key')
bash
pip install "openai>=1.0"
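If you are working in a notebook or a quick script and prefer not to rely on shell exports, the key can also be set for the current process only. A minimal sketch (replace the placeholder with your real key, and never commit it to source control):

```python
import os

# Set the key for this process only if it is not already exported.
# "your_api_key" is a placeholder, not a working credential.
os.environ.setdefault("TOGETHER_API_KEY", "your_api_key")

print("TOGETHER_API_KEY is set:", bool(os.environ.get("TOGETHER_API_KEY")))
```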

Step by step

Use the OpenAI-compatible SDK with Together AI's base URL and your API key. Specify a DeepSeek model ID (e.g., deepseek-ai/DeepSeek-V3) and provide chat messages. The example below sends a user prompt and prints the model's response.

python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["TOGETHER_API_KEY"],
    base_url="https://api.together.xyz/v1"
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=[{"role": "user", "content": "Explain retrieval-augmented generation (RAG) in simple terms."}]
)

print(response.choices[0].message.content)
output
Retrieval-augmented generation (RAG) is a technique where an AI model first searches a large database or documents to find relevant information, then uses that information to generate more accurate and informed responses.
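Common optional parameters such as temperature and max_tokens pass through the same call unchanged. A hedged sketch of a small helper that assembles the request arguments (build_chat_request is not part of any SDK, and the default values are illustrative, not tuned recommendations):

```python
from typing import Any, Dict, List, Optional

def build_chat_request(prompt: str,
                       system: Optional[str] = None,
                       model: str = "deepseek-ai/DeepSeek-V3",
                       temperature: float = 0.7,
                       max_tokens: int = 256) -> Dict[str, Any]:
    """Assemble keyword arguments for client.chat.completions.create."""
    messages: List[Dict[str, str]] = []
    if system:
        # An optional system message steers tone and format.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,  # lower values make output more deterministic
        "max_tokens": max_tokens,    # caps the length of the completion
    }

# With a configured client:
# response = client.chat.completions.create(**build_chat_request(
#     "Summarize RAG in one sentence.",
#     system="You are a concise technical assistant."))
```

Keeping the arguments in one dict makes it easy to log or reuse the exact request that produced a given response.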

Common variations

You can use streaming to receive partial responses as they are generated, or switch to other DeepSeek models such as deepseek-ai/DeepSeek-R1 for reasoning tasks. Here's how to enable streaming:

python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["TOGETHER_API_KEY"],
    base_url="https://api.together.xyz/v1"
)

stream = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=[{"role": "user", "content": "Summarize the benefits of RAG."}],
    stream=True
)

for chunk in stream:
    if not chunk.choices:  # some chunks (e.g., a final usage chunk) carry no choices
        continue
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
output
Retrieval-augmented generation (RAG) improves AI responses by combining external knowledge retrieval with generation, making answers more accurate and up-to-date.
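When you switch to deepseek-ai/DeepSeek-R1, the response content typically interleaves the model's chain of thought inside <think>...</think> tags before the final answer (an assumption about the current serving format; verify against your own responses). A small sketch of a helper that separates the two (split_reasoning is hypothetical, not part of any SDK):

```python
import re
from typing import Tuple

def split_reasoning(content: str) -> Tuple[str, str]:
    """Split DeepSeek-R1 style output into (reasoning, answer).

    Assumes the reasoning is wrapped in a leading <think>...</think> block;
    if no such block is present, the whole content is treated as the answer.
    """
    match = re.match(r"\s*<think>(.*?)</think>\s*(.*)", content, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return "", content.strip()

# reasoning, answer = split_reasoning(response.choices[0].message.content)
```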

Troubleshooting

  • If you get authentication errors, verify your TOGETHER_API_KEY environment variable is set correctly.
  • If the model is not found, confirm you are using a model ID from Together AI's catalog, such as deepseek-ai/DeepSeek-V3 or deepseek-ai/DeepSeek-R1. Together uses org-prefixed IDs, not DeepSeek's own deepseek-chat/deepseek-reasoner names.
  • For network issues, check your internet connection and that https://api.together.xyz/v1 is reachable.
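The checks above can be bundled into a quick preflight that runs before any request. A minimal stdlib-only sketch (the VALID_MODELS set is an assumption about the current catalog; check Together AI's model list for the authoritative IDs):

```python
import os
from typing import List
from urllib.parse import urlparse

BASE_URL = "https://api.together.xyz/v1"
# Assumption: these are the current DeepSeek IDs on Together AI's catalog.
VALID_MODELS = {"deepseek-ai/DeepSeek-V3", "deepseek-ai/DeepSeek-R1"}

def preflight(model: str, base_url: str = BASE_URL) -> List[str]:
    """Return a list of likely configuration problems (empty means OK)."""
    problems = []
    if not os.environ.get("TOGETHER_API_KEY"):
        problems.append("TOGETHER_API_KEY is not set")
    if model not in VALID_MODELS:
        problems.append(f"unrecognized model name: {model!r}")
    parsed = urlparse(base_url)
    if parsed.scheme != "https" or not parsed.netloc:
        problems.append(f"base_url does not look like an HTTPS endpoint: {base_url!r}")
    return problems

# for problem in preflight("deepseek-ai/DeepSeek-V3"):
#     print("preflight:", problem)
```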

Key Takeaways

  • Use the OpenAI-compatible openai SDK with Together AI's base_url to access DeepSeek models.
  • Set your Together AI API key in the TOGETHER_API_KEY environment variable for authentication.
  • Streaming responses enable real-time token-by-token output from DeepSeek models.
  • DeepSeek offers specialized models on Together AI: deepseek-ai/DeepSeek-V3 for chat and deepseek-ai/DeepSeek-R1 for reasoning tasks.
  • Check environment variables and model names carefully to avoid common errors.
Verified 2026-04 · deepseek-ai/DeepSeek-V3, deepseek-ai/DeepSeek-R1