Code beginner · 3 min read

How to call DeepSeek API in Python

Direct answer
Use the openai Python SDK (DeepSeek's API is OpenAI-compatible) with your DEEPSEEK_API_KEY and base_url="https://api.deepseek.com", then call client.chat.completions.create() with model deepseek-chat.

Setup

Install
bash
pip install openai
Env vars
DEEPSEEK_API_KEY
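On macOS/Linux you can export the key in your shell before running a script; the key value below is a placeholder, not a real key.

```bash
# Make the key visible to os.environ in Python
# (replace the placeholder with your key from the DeepSeek dashboard)
export DEEPSEEK_API_KEY="sk-your-key-here"
```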
Imports
python
from openai import OpenAI
import os

Examples

In: Hello, who won the 2024 Olympics?
Out: The 2024 Summer Olympics were held in Paris, and the United States topped the medal count.
In: Explain quantum computing in simple terms.
Out: Quantum computing uses quantum bits that can be both 0 and 1 at the same time, enabling faster problem solving for certain tasks.
In: Summarize the latest AI trends.
Out: Recent AI trends include large multimodal models, improved reasoning capabilities, and wider adoption in industry applications.

Integration steps

  1. Import OpenAI client and load your DeepSeek API key from environment variables.
  2. Initialize the OpenAI client with api_key and base_url='https://api.deepseek.com'.
  3. Build the messages list with user prompts as dictionaries with role and content.
  4. Call client.chat.completions.create() with model deepseek-chat and the messages.
  5. Extract the response text from response.choices[0].message.content.
  6. Print or use the generated text in your application.
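The six steps above can also be wrapped in a small reusable helper. This is a sketch, not the only way to structure it: build_messages and ask_deepseek are hypothetical convenience names, and the message-building (step 3) is kept separate from the network call (steps 4-5) so it can be reused and tested on its own.

```python
import os

def build_messages(prompt: str) -> list[dict]:
    # Step 3: wrap the user prompt in the chat message format
    return [{"role": "user", "content": prompt}]

def ask_deepseek(prompt: str) -> str:
    # Imported here so the pure helper above stays dependency-free
    from openai import OpenAI

    # Steps 1-2: client pointed at DeepSeek's endpoint
    client = OpenAI(
        api_key=os.environ["DEEPSEEK_API_KEY"],
        base_url="https://api.deepseek.com",
    )
    # Steps 4-5: call the API and extract the generated text
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=build_messages(prompt),
    )
    return response.choices[0].message.content
```

With this in place, step 6 is just `print(ask_deepseek("your prompt"))`.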

Full code

python
from openai import OpenAI
import os

# Load API key from environment
api_key = os.environ["DEEPSEEK_API_KEY"]

# Initialize DeepSeek client with base_url
client = OpenAI(api_key=api_key, base_url="https://api.deepseek.com")

# Prepare chat messages
messages = [
    {"role": "user", "content": "Explain the benefits of renewable energy."}
]

# Call DeepSeek chat completions endpoint
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages
)

# Extract and print the answer
answer = response.choices[0].message.content
print("DeepSeek response:", answer)

API trace

Request
json
{"model": "deepseek-chat", "messages": [{"role": "user", "content": "Explain the benefits of renewable energy."}]}
Response
json
{"choices": [{"message": {"content": "Renewable energy reduces greenhouse gas emissions..."}}], "usage": {"total_tokens": 75}}
Extract: response.choices[0].message.content
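If you log raw JSON payloads like the trace above, the same extraction path works on the parsed dict. A minimal sketch using the response body shown:

```python
import json

# Raw response body as logged in the trace above
raw = (
    '{"choices": [{"message": {"content": '
    '"Renewable energy reduces greenhouse gas emissions..."}}], '
    '"usage": {"total_tokens": 75}}'
)

data = json.loads(raw)

# Same path as response.choices[0].message.content, on plain dicts
answer = data["choices"][0]["message"]["content"]
tokens = data["usage"]["total_tokens"]

print(answer)
print("tokens used:", tokens)
```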

Variants

Streaming response
python
from openai import OpenAI
import os

api_key = os.environ["DEEPSEEK_API_KEY"]
client = OpenAI(api_key=api_key, base_url="https://api.deepseek.com")

messages = [{"role": "user", "content": "Tell me a story about AI."}]

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages,
    stream=True
)

for chunk in response:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="")
print()
Async call with asyncio
python
import asyncio
from openai import AsyncOpenAI
import os

async def main():
    api_key = os.environ["DEEPSEEK_API_KEY"]
    client = AsyncOpenAI(api_key=api_key, base_url="https://api.deepseek.com")
    messages = [{"role": "user", "content": "What is the future of AI?"}]
    response = await client.chat.completions.create(
        model="deepseek-chat",
        messages=messages
    )
    print(response.choices[0].message.content)

asyncio.run(main())
Use alternative model deepseek-reasoner
python
from openai import OpenAI
import os

api_key = os.environ["DEEPSEEK_API_KEY"]
client = OpenAI(api_key=api_key, base_url="https://api.deepseek.com")

messages = [{"role": "user", "content": "Solve this math problem: 123 * 456."}]

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=messages
)

print(response.choices[0].message.content)

Performance

Latency: ~700ms for typical DeepSeek chat completion calls
Cost: ~$0.0015 per 500 tokens exchanged
Rate limits: Tier 1: 600 requests per minute, 40,000 tokens per minute
  • Keep prompts concise to reduce token usage.
  • Use system messages sparingly to save tokens.
  • Batch multiple queries if possible to optimize throughput.
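Batching is easiest with asyncio.gather, which fires several requests concurrently instead of awaiting them one at a time. The sketch below uses a stub coroutine in place of the real API call so it runs offline; in practice you would await an AsyncOpenAI client's chat.completions.create() inside fake_deepseek_call.

```python
import asyncio

async def fake_deepseek_call(prompt: str) -> str:
    # Stub standing in for an AsyncOpenAI chat.completions.create call
    await asyncio.sleep(0.01)  # simulate network latency
    return f"answer to: {prompt}"

async def batch(prompts):
    # Schedule all requests concurrently and collect results in order
    return await asyncio.gather(*(fake_deepseek_call(p) for p in prompts))

results = asyncio.run(batch(["q1", "q2", "q3"]))
print(results)
```

Total wall-clock time is roughly one call's latency rather than the sum of all calls, as long as you stay within the rate limits above.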
Approach | Latency | Cost/call | Best for
Standard call | ~700ms | ~$0.0015 | General chat completions
Streaming | ~700ms initial + incremental | ~$0.0015 | Real-time UI updates
Async call | ~700ms | ~$0.0015 | Concurrent requests in async apps
deepseek-reasoner model | ~900ms | ~$0.002 | Complex reasoning and math

Quick tip

Always set base_url="https://api.deepseek.com" when initializing the OpenAI client to target DeepSeek's API endpoint.

Common mistake

Forgetting to specify the base_url parameter causes requests to go to OpenAI's servers instead of DeepSeek's API.

Verified 2026-04 · deepseek-chat, deepseek-reasoner