Cheat Sheet beginner · 8 min read

DeepSeek API Cheat Sheet — Chat & Reasoning Models

version 1.0

DeepSeek's fastest reasoning + chat models

env var DEEPSEEK_API_KEY

install pip install openai

core imports

python

from openai import OpenAI
import os

Mental model

OpenAI-compatible API for DeepSeek's reasoning and chat models.

You keep the OpenAI SDK and change only the API key, base_url, and model name: the request and response shapes stay the same, while model behavior differs. The reasoning model also returns its scratchpad (reasoning_content) alongside the final answer.

Core Patterns

01 Basic Chat (deepseek-chat V3)

Simple completions, multimodal, fast inference

python

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com"
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "user", "content": "What is 2+2?"}
    ],
    temperature=0.7,
    max_tokens=1024
)

print(response.choices[0].message.content)

output 4

Set base_url to https://api.deepseek.com. With the default OpenAI URL, the request goes to OpenAI and your DeepSeek key is rejected.

02 Reasoning Model (deepseek-reasoner R1)

Complex reasoning, math, logic problems requiring chain-of-thought

python

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com"
)

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[
        {"role": "user", "content": "Solve: If x + y = 10 and x - y = 2, find x and y"}
    ]
)

# Read the reasoning trace; getattr avoids AttributeError if the field is absent
message = response.choices[0].message
reasoning = getattr(message, "reasoning_content", None)
if reasoning:
    print("Thinking:", reasoning)
print("Answer:", message.content)

The reasoning model returns its chain of thought in reasoning_content and the final answer in content. Log or display the reasoning when you need to audit how the answer was reached.
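
A minimal sketch of reading both fields without risking an AttributeError. The helper name `split_reasoning` is hypothetical, and the `SimpleNamespace` objects are stand-ins for `response.choices[0].message` so the snippet runs without a network call.

```python
from types import SimpleNamespace


def split_reasoning(message):
    """Return (reasoning, answer); reasoning is None when the field is absent."""
    reasoning = getattr(message, "reasoning_content", None)
    return reasoning, message.content


# Stand-ins for response.choices[0].message (assumed shapes, not real API output)
reasoner_msg = SimpleNamespace(reasoning_content="2x = 12, so x = 6", content="x = 6, y = 4")
chat_msg = SimpleNamespace(content="4")

print(split_reasoning(reasoner_msg))
print(split_reasoning(chat_msg))
```

The same helper works for either model, so calling code does not need to branch on the model name.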

03 Streaming Responses

Real-time token output, long responses, reduce latency perception

python

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com"
)

stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Write a haiku"}],
    stream=True
)

for chunk in stream:
    # Guard against chunks with an empty choices list, then against None content
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Streamed chunks carry text in choices[0].delta.content, not message.content. The value can be None, so check it before printing.
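
If you also need the full text after streaming, one option is to accumulate the deltas as they arrive. The sketch below uses a hypothetical `collect_text` helper and hand-built stand-in chunks instead of a live stream, so the chunk shapes are assumptions modelled on the loop above.

```python
from types import SimpleNamespace


def collect_text(stream):
    """Join delta.content pieces, skipping empty choices and None content."""
    parts = []
    for chunk in stream:
        if not chunk.choices:      # some chunks may carry no choices
            continue
        piece = chunk.choices[0].delta.content
        if piece:                  # None or "" on role-only or final chunks
            parts.append(piece)
    return "".join(parts)


def fake_chunk(content):
    """Hypothetical stand-in for a streamed chunk."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [fake_chunk(None), fake_chunk("Hello"),
               SimpleNamespace(choices=[]), fake_chunk(" world")]
full_text = collect_text(fake_stream)
print(full_text)
```

With a real client, pass the object returned by `client.chat.completions.create(..., stream=True)` in place of `fake_stream`.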

04 Vision (Image Input)

Analyze images, multimodal reasoning

python

from openai import OpenAI
import os
import base64

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com"
)

# Image from URL
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image.jpg"}
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

# Image from base64
with open("image.jpg", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_data}"}
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

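deepseek-reasoner does NOT support vision. Use deepseek-chat for images, and confirm image input against the current DeepSeek API docs first: the API has historically accepted text-only messages, so an image_url part may be rejected.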
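
The base64 example above hard-codes the JPEG MIME type. A small helper, sketched here with the hypothetical names `to_data_url` and `image_part`, keeps the MIME type and the encoding together. The bytes used below are placeholder data, not a real image.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_part(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Build an image_url content part in the same shape as the example above."""
    return {"type": "image_url", "image_url": {"url": to_data_url(image_bytes, mime_type)}}


part = image_part(b"abc", "image/png")  # placeholder bytes for illustration
print(part["image_url"]["url"])
```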

05 System Prompt & Role

Set behavior, style, constraints for the model

python

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com"
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {
            "role": "system",
            "content": "You are a Python expert. Answer only in Python code blocks."
        },
        {
            "role": "user",
            "content": "How do I read a JSON file?"
        }
    ],
    temperature=0.3
)

print(response.choices[0].message.content)

System message must be first in messages array with role='system'. Reasoner model may deprioritize system prompts.
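
Chat completions requests are stateless, so a multi-turn conversation means resending the history with the system message still in first position. The following is a sketch with hypothetical helper names (`start_history`, `add_turn`) and a placeholder assistant reply.

```python
def start_history(system_prompt: str) -> list:
    """Create a message list with the system message in first position."""
    return [{"role": "system", "content": system_prompt}]


def add_turn(history: list, user_text: str, assistant_text: str) -> list:
    """Append one completed user/assistant exchange to the history."""
    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": assistant_text})
    return history


history = start_history("You are a Python expert. Answer only in Python code blocks.")
add_turn(history, "How do I read a JSON file?", "import json ...")  # placeholder reply
history.append({"role": "user", "content": "Now write it back to disk."})

print([m["role"] for m in history])
```

Pass `history` as the `messages` argument on the next call.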

06 Function Calling (Tool Use)

Structured outputs, API integration, tool invocation

python

from openai import OpenAI
import os
import json

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com"
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"},
                    "unit": {"type": "string", "enum": ["C", "F"]}
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "What's the weather in NYC?"}],
    tools=tools,
    tool_choice="auto"
)

# Check if model wants to call a tool
if response.choices[0].message.tool_calls:
    for call in response.choices[0].message.tool_calls:
        print(f"Tool: {call.function.name}")
        print(f"Args: {call.function.arguments}")
else:
    print(response.choices[0].message.content)

tool_choice='auto' means the model decides. Use 'required' to force tool use, or 'none' to disable.
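
After the model requests a tool, your code runs it and sends the result back as a role='tool' message. The sketch below assumes the OpenAI-compatible tool message shape (`tool_call_id` plus string `content`). The `get_weather` implementation, the `LOCAL_TOOLS` registry, and the stand-in tool call are hypothetical.

```python
import json
from types import SimpleNamespace


def get_weather(location, unit="C"):
    """Hypothetical local implementation; returns canned data."""
    return {"location": location, "temperature": 21, "unit": unit}


LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_calls(tool_calls):
    """Execute each requested tool and build role='tool' reply messages."""
    replies = []
    for call in tool_calls:
        args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
        result = LOCAL_TOOLS[call.function.name](**args)
        replies.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result),
        })
    return replies


# Stand-in for response.choices[0].message.tool_calls
fake_calls = [SimpleNamespace(
    id="call_0",
    function=SimpleNamespace(name="get_weather",
                             arguments='{"location": "NYC", "unit": "F"}'),
)]
tool_messages = run_tool_calls(fake_calls)
print(tool_messages)
```

To finish the exchange, append the assistant message that contained the tool calls, then these tool messages, and call `client.chat.completions.create` again so the model can write the final answer.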

Common Request Parameters

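Parameters accepted by both deepseek-chat and deepseek-reasoner. DeepSeek's reasoning-model guide has listed temperature, top_p, and the penalty parameters as having no effect on deepseek-reasoner, so treat them as chat-model controls.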

Parameter	Type	Default	Notes
`model`	string	required	deepseek-chat or deepseek-reasoner
`messages`	list	required	List of {role, content} dicts; role is system, user, or assistant
`temperature`	float	1.0	0.0–2.0. Lower = more deterministic, higher = more varied
`max_tokens`	int	4096	Max output length. Reasoning model may use more internally.
`top_p`	float	1.0	Nucleus sampling, 0.0–1.0. Tune this or temperature, not both.
`frequency_penalty`	float	0.0	-2.0 to 2.0. Penalize repeated tokens.
`presence_penalty`	float	0.0	-2.0 to 2.0. Encourage new topics.
`stream`	bool	false	Enable streaming responses.
`tools`	list	null	Function definitions for tool calling.
`tool_choice`	string	auto	'auto', 'required', or 'none': controls tool use.
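
Out-of-range values are easier to catch before the request is sent. This is an illustrative checker, not part of the SDK: the ranges are copied from the table above and the name `check_params` is hypothetical.

```python
# Ranges copied from the table above; the checker itself is illustrative.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}


def check_params(**params):
    """Raise ValueError for any sampling parameter outside its listed range."""
    for name, value in params.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return params


ok = check_params(temperature=0.3, top_p=1.0, max_tokens=1024)
print(ok)

try:
    check_params(temperature=3.0)
except ValueError as err:
    print(err)
```

The returned dict can be unpacked into the request, for example `client.chat.completions.create(model=..., messages=..., **ok)`.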

Common Errors & Fixes

01 AuthenticationError: Invalid API key

Cause: DEEPSEEK_API_KEY env var missing, empty, or wrong. Or using OpenAI key instead.

Fix:

python

from openai import OpenAI
import os

# Check env var exists
api_key = os.environ.get("DEEPSEEK_API_KEY")
if not api_key:
    raise ValueError("DEEPSEEK_API_KEY not set")

client = OpenAI(
    api_key=api_key,
    base_url="https://api.deepseek.com"
)

# Or set in shell: export DEEPSEEK_API_KEY=sk-...

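02 APIConnectionError or AuthenticationError: wrong base_url

Cause: base_url is wrong. A typo in the host typically raises APIConnectionError. Leaving the default OpenAI URL (https://api.openai.com/v1) sends your DeepSeek key to OpenAI, which rejects it with AuthenticationError.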

Fix:

python

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com"  # DeepSeek endpoint, not the OpenAI default
)

# Verify in code before making requests
print(client.base_url)  # Should print https://api.deepseek.com/

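03 BadRequestError: model not found

Cause: Model name is invalid or deprecated, such as an old name like deepseek-v2 or a typo in deepseek-reasoner. The SDK surfaces this as an API status error (typically BadRequestError), not a built-in ValueError.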

Fix:

python

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com"
)

# Valid models (2026-04)
response = client.chat.completions.create(
    model="deepseek-chat",  # Current: V3
    # OR
    # model="deepseek-reasoner",  # Current: R1
    messages=[{"role": "user", "content": "test"}]
)

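04 AttributeError: 'reasoning_content' when using deepseek-chat

Cause: Accessing reasoning_content on a chat-model message. The SDK returns objects, not dicts, so the failure is an AttributeError rather than a KeyError. Only deepseek-reasoner populates the field.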

Fix:

python

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com"
)

response = client.chat.completions.create(
    model="deepseek-reasoner",  # Switch to reasoner
    messages=[{"role": "user", "content": "Complex math problem"}]
)

# Safe access to thinking
if hasattr(response.choices[0].message, 'reasoning_content'):
    print(response.choices[0].message.reasoning_content)

05 IndexError: list index out of range on chunk.choices[0]

Cause: Streaming chunk has empty choices list (keepalive ping or malformed).

Fix:

python

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com"
)

stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True
)

for chunk in stream:
    # Skip chunks that carry no choices before indexing
    if chunk.choices:
        delta = chunk.choices[0].delta
        if delta.content:
            print(delta.content, end="", flush=True)

06 BadRequestError: Reasoning model does not support tools/vision

Cause: Using deepseek-reasoner with tool_calls or vision. Reasoner has limited feature support.

Fix:

python

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com"
)

# For vision or tools: use deepseek-chat
response = client.chat.completions.create(
    model="deepseek-chat",  # Not deepseek-reasoner
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this"},
                {"type": "image_url", "image_url": {"url": "https://..."}}
            ]
        }
    ]
)

API Reference

Method / Property	Description	Returns
`client.chat.completions.create()`	Send a message to chat/reasoning model. Returns ChatCompletion object or stream iterator.	ChatCompletion, or Stream[ChatCompletionChunk] if stream=True
`response.choices[0].message.content`	The model's text response. May be None or empty when the model returns tool_calls instead of text.	str or None
`response.choices[0].message.reasoning_content`	Internal reasoning chain (deepseek-reasoner only). Shows the model's thinking process.	str (reasoning model) or AttributeError (chat model)
`response.choices[0].message.tool_calls`	List of tool invocations. Each has .function.name and .function.arguments (JSON string).	list[ToolCall] or None
`response.usage.total_tokens`	Total input + output tokens used in request.	int
`response.model`	The model name actually used (e.g., deepseek-chat).	str

Production Gotchas

⚠ base_url is critical: default URL will fail

DeepSeek API is not OpenAI. If you forget base_url='https://api.deepseek.com', the request goes to OpenAI and authenticates against the wrong API. Always set it explicitly and verify in logs.
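
One way to verify the endpoint before sending traffic is a startup check on the configured URL. The function name `is_deepseek_base_url` is hypothetical. It accepts a string or anything that converts to one with `str()`, such as `client.base_url`.

```python
from urllib.parse import urlparse


def is_deepseek_base_url(base_url) -> bool:
    """True when the URL uses https and the api.deepseek.com host."""
    parsed = urlparse(str(base_url))
    return parsed.scheme == "https" and parsed.hostname == "api.deepseek.com"


print(is_deepseek_base_url("https://api.deepseek.com/"))
print(is_deepseek_base_url("https://api.openai.com/v1/"))
```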

⚠ deepseek-reasoner does not support vision, tools, or system prompts effectively

If you need images, function calling, or strong system role enforcement, use deepseek-chat. Reasoning model is optimized for math/logic chains-of-thought only. Don't force it into multimodal tasks.

⚠ Streaming with reasoning model exposes thinking gradually

When streaming deepseek-reasoner with stream=True, you get reasoning chunks first, then final answer. Your UI must handle this: show thinking, then replace with answer, or buffer until done.
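
A sketch of the buffering approach: keep reasoning deltas and answer deltas in separate buffers so the UI can decide what to show. It assumes streamed deltas expose `reasoning_content` next to `content`. The `split_stream` helper and the stand-in chunks are hypothetical.

```python
from types import SimpleNamespace


def split_stream(stream):
    """Buffer reasoning and answer deltas separately; return (reasoning, answer)."""
    reasoning, answer = [], []
    for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        thought = getattr(delta, "reasoning_content", None)
        if thought:
            reasoning.append(thought)
        if delta.content:
            answer.append(delta.content)
    return "".join(reasoning), "".join(answer)


def fake_chunk(reasoning=None, content=None):
    """Hypothetical stand-in for a streamed deepseek-reasoner chunk."""
    delta = SimpleNamespace(reasoning_content=reasoning, content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [
    fake_chunk(reasoning="Add the equations: "),
    fake_chunk(reasoning="2x = 12."),
    fake_chunk(content="x = 6, "),
    fake_chunk(content="y = 4"),
]
thinking, final = split_stream(fake_stream)
print("Thinking:", thinking)
print("Answer:", final)
```

For a live UI, emit each piece as it arrives instead of joining at the end. The separation logic stays the same.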

⚠ max_tokens applies to output only, not reasoning overhead

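For deepseek-reasoner, max_tokens limits the final answer, while internal reasoning can consume 5–10x more tokens. If output looks truncated, check whether finish_reason is "length" and raise max_tokens (8k–16k). Whether the cap also counts reasoning tokens has differed between API versions, so confirm against the current docs. Reasoning tokens are billed in addition to the answer tokens.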
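
Truncation can be detected programmatically rather than by eye: a choice that stopped at the token limit reports finish_reason "length". The helper name `was_truncated` and the stand-in response objects are hypothetical.

```python
from types import SimpleNamespace


def was_truncated(response) -> bool:
    """True when the first choice stopped because it hit the token limit."""
    return response.choices[0].finish_reason == "length"


# Stand-ins for ChatCompletion objects (assumed shapes)
cut_off = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])
complete = SimpleNamespace(choices=[SimpleNamespace(finish_reason="stop")])

print(was_truncated(cut_off))
print(was_truncated(complete))
```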

⚠ Tool arguments are JSON strings, not objects

When a tool_call is returned, .function.arguments is a JSON string. Parse it: json.loads(call.function.arguments). Don't pass it directly to your function: it will fail.
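
Model-generated arguments can also be malformed or incomplete, so it may help to validate them while parsing. This is a sketch under that assumption. `parse_arguments` is a hypothetical helper that returns an error message instead of raising.

```python
import json


def parse_arguments(raw: str, required=()):
    """Decode a tool-call arguments string; return (args, error_message)."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as err:
        return None, f"invalid JSON: {err.msg}"
    if not isinstance(args, dict):
        return None, "arguments must decode to an object"
    missing = [key for key in required if key not in args]
    if missing:
        return None, f"missing required keys: {missing}"
    return args, None


print(parse_arguments('{"location": "NYC"}', required=["location"]))
print(parse_arguments('{"unit": "F"}', required=["location"]))
print(parse_arguments('{"location": ', required=["location"]))
```

Returning the error text lets you send it back to the model as the tool result, so the model can retry with corrected arguments.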

⚠ Rate limits: check response headers and back off exponentially

DeepSeek can return HTTP 429 when requests arrive too quickly, which the SDK raises as RateLimitError. If the response carries a retry-after header, honor it. Otherwise implement exponential backoff: 1s → 2s → 4s → 8s.
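
The backoff schedule can be sketched as a small retry wrapper. The names `call_with_backoff` and `FakeRateLimit` are hypothetical. In real code you would pass `retry_on=(openai.RateLimitError,)` and leave `sleep` at its default. The injected `sleep` here only records delays so the example runs instantly.

```python
import time


def call_with_backoff(fn, retry_on=(Exception,), max_retries=4,
                      base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a retryable error wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries:
                raise                      # out of retries: surface the error
            sleep(base_delay * 2 ** attempt)


class FakeRateLimit(Exception):
    """Hypothetical stand-in for openai.RateLimitError."""


attempts = {"count": 0}


def flaky():
    """Fail three times, then succeed, to exercise the retry path."""
    attempts["count"] += 1
    if attempts["count"] < 4:
        raise FakeRateLimit()
    return "ok"


delays = []
result = call_with_backoff(flaky, retry_on=(FakeRateLimit,), sleep=delays.append)
print(result, delays)
```

Adding random jitter to each delay is a common refinement when many clients retry at once.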

⚠ Temperature + top_p together can cause unstable behavior

Using both temperature > 0.7 AND top_p < 0.9 simultaneously can produce erratic outputs. Pick one: either control temperature (simple) or top_p (nucleus sampling). Don't combine aggressively.

Verified 2026-04 · deepseek-chat, deepseek-reasoner
