High severity · beginner · Fix: 2-5 min

TypeError

TypeError: ChatCompletion.create() got an unexpected keyword argument 'max_tokens'

What this error means
OpenAI's o1 and o3 reasoning models do not accept the max_tokens parameter; use max_completion_tokens instead to control output length.

Stack trace

traceback
File "/path/to/your/script.py", line 42, in <module>
  response = client.chat.completions.create(
    model="o1",
    max_tokens=2000,  # ← This parameter is not allowed for reasoning models
    messages=messages
  )
TypeError: ChatCompletion.create() got an unexpected keyword argument 'max_tokens'
QUICK FIX
Replace max_tokens=<value> with max_completion_tokens=<value> in your client.chat.completions.create() call for o1/o3 models.

Why it happens

OpenAI's reasoning models (o1, o3, o3-mini) spend significant compute on internal reasoning before generating visible output, so they use a different parameter for output length control. OpenAI deprecated max_tokens, the parameter familiar from standard chat models (gpt-4o, gpt-4o-mini), in favor of max_completion_tokens, and the reasoning models accept only the new name. max_completion_tokens caps the total tokens generated for a completion, which includes the hidden reasoning tokens as well as the visible output. Passing the old parameter name fails because these models explicitly reject it. Depending on your SDK version, the rejection may surface as an HTTP 400 error from the API (openai.BadRequestError) rather than a Python TypeError.

Detection

Validate parameters before calling o1/o3 models: any call that passes max_tokens to client.chat.completions.create() for these models fails on the first request with the error above. Exercise every reasoning-model call path against a live API key before deploying, so the failure shows up in testing rather than in production.
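The validation described above could look roughly like the following. This is a minimal sketch, not part of the OpenAI SDK: check_token_params and REASONING_PREFIXES are hypothetical names, and the prefix list assumes only the o1/o3 families covered on this page.

```python
# Assumption: only the model families named on this page are treated as reasoning models.
REASONING_PREFIXES = ("o1", "o3")


def check_token_params(model: str, params: dict) -> None:
    """Raise ValueError before any API call if a reasoning model is paired with max_tokens."""
    if model.startswith(REASONING_PREFIXES) and "max_tokens" in params:
        raise ValueError(
            f"{model} does not accept max_tokens; use max_completion_tokens instead"
        )


# Both of these pass silently, because the parameter matches the model family.
check_token_params("gpt-4o", {"max_tokens": 2000})
check_token_params("o1", {"max_completion_tokens": 2000})
```

Calling this helper at the top of your own request function turns the mistake into a local, descriptive failure that a unit test can catch without network access.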

Causes & fixes

1

Using max_tokens parameter with o1 or o3 model (legacy parameter name from standard chat models)

✓ Fix

Replace max_tokens with max_completion_tokens. Example: max_completion_tokens=2000 instead of max_tokens=2000

2

Copying code from gpt-4o examples without checking reasoning model documentation

✓ Fix

Review OpenAI reasoning model docs at platform.openai.com/docs/guides/reasoning. Reasoning models have a different parameter interface than standard chat models.

3

Not setting any output limit and assuming max_tokens will work as default

✓ Fix

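For o1/o3 models, always set max_completion_tokens explicitly. Size it for your use case: at least 1000 for simple tasks, 10000 or more for complex reasoning. The limit counts reasoning tokens as well as visible output, so a value that is too low can leave little or no room for the final answer.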

4

Using an outdated LangChain or wrapper library that passes max_tokens to o1/o3 without translation

✓ Fix

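Upgrade to langchain>=0.3.0 together with a current release of its OpenAI integration package (langchain-openai). With model="o1" in LangChain ChatOpenAI, max_tokens is auto-translated to max_completion_tokens. If you use a different wrapper library, check that it forwards max_completion_tokens for o1/o3 models.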

Code: broken vs fixed

Broken - triggers the error
python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

messages = [
    {"role": "user", "content": "Solve this math problem: what is 17 * 24?"}
]

# ❌ BROKEN: max_tokens is not allowed for o1/o3 models
response = client.chat.completions.create(
    model="o1",
    max_tokens=2000,  # ← TypeError: unexpected keyword argument
    messages=messages
)

print(response.choices[0].message.content)
Fixed - works correctly
python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

messages = [
    {"role": "user", "content": "Solve this math problem: what is 17 * 24?"}
]

# ✅ FIXED: Use max_completion_tokens for o1/o3 reasoning models
response = client.chat.completions.create(
    model="o1",
    max_completion_tokens=2000,  # ← Changed from max_tokens to max_completion_tokens
    messages=messages
)

print(response.choices[0].message.content)
Changed max_tokens to max_completion_tokens, the parameter name that o1/o3 reasoning models accept in OpenAI SDK v1+. The limit it sets covers both the model's internal reasoning tokens and its final visible output.

Workaround

If you need to support both standard chat models (gpt-4o) and reasoning models (o1) in the same codebase, create a conditional wrapper: if model.startswith('o1') or model.startswith('o3'), use max_completion_tokens; otherwise use max_tokens. This allows gradual migration without rewriting all call sites.
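The conditional described above can be sketched as a small helper that returns the right keyword argument for a given model. token_limit_kwargs is a hypothetical name, and the prefix check mirrors the rule stated in this section; it is an illustration under those assumptions, not an official mapping.

```python
def token_limit_kwargs(model: str, limit: int) -> dict:
    """Return the output-limit keyword argument appropriate for the model family."""
    # Assumption: reasoning models are identified by the "o1"/"o3" name prefix.
    if model.startswith("o1") or model.startswith("o3"):
        return {"max_completion_tokens": limit}
    return {"max_tokens": limit}


# Usage sketch (client is an openai.OpenAI instance, as in the examples on this page):
# response = client.chat.completions.create(
#     model=model,
#     messages=messages,
#     **token_limit_kwargs(model, 2000),
# )
```

Because the helper only builds a dict, existing call sites can adopt it one at a time by unpacking its result in place of the hard-coded parameter.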

Prevention

Build a model-aware API wrapper that abstracts parameter differences between model families. Example: a create_completion(model, max_output=2000) function that passes max_completion_tokens=max_output to client.chat.completions.create() when model is in ['o1', 'o3', 'o3-mini'], and max_tokens=max_output otherwise. Test all model families in your test suite explicitly.
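One way to test such a wrapper across model families without spending API credits is to pass in a stub client that records the arguments it receives. The sketch below assumes a wrapper that takes the client and messages as explicit arguments; make_stub_client is a hypothetical stand-in for openai.OpenAI, not part of any library.

```python
from types import SimpleNamespace

# Assumption: the reasoning models listed in this section.
REASONING_MODELS = {"o1", "o3", "o3-mini"}


def create_completion(client, model, messages, max_output=2000):
    """Call the chat completions endpoint with the limit parameter the model accepts."""
    limit_name = "max_completion_tokens" if model in REASONING_MODELS else "max_tokens"
    return client.chat.completions.create(
        model=model, messages=messages, **{limit_name: max_output}
    )


def make_stub_client(calls):
    """Hypothetical test double: records keyword arguments instead of calling the API."""
    def create(**kwargs):
        calls.append(kwargs)
        return kwargs

    completions = SimpleNamespace(create=create)
    return SimpleNamespace(chat=SimpleNamespace(completions=completions))


calls = []
stub = make_stub_client(calls)
for name in ["o1", "o3", "o3-mini", "gpt-4o"]:
    create_completion(stub, name, [{"role": "user", "content": "hi"}])

# Every recorded call should carry exactly the parameter its model family expects.
for kwargs in calls:
    expected = "max_completion_tokens" if kwargs["model"] in REASONING_MODELS else "max_tokens"
    assert expected in kwargs
```

A stub like this only verifies which parameter name is sent; it does not replace the live-key check, since only the real API can confirm that a given model accepts the request.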

Python 3.9+ · openai >=1.0.0 · tested on 1.59.0+ (April 2026)
Verified 2026-04 · o1, o3, o3-mini