TypeError
TypeError: ChatCompletion.create() got an unexpected keyword argument 'max_tokens'
Stack trace
  File "/path/to/your/script.py", line 42, in <module>
    response = client.chat.completions.create(
        model="o1",
        max_tokens=2000,  # ← This parameter is not allowed for reasoning models
        messages=messages
    )
TypeError: ChatCompletion.create() got an unexpected keyword argument 'max_tokens'
Why it happens
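OpenAI's reasoning models (o1, o3, o3-mini) use a different parameter name for output length control because they spend significant compute on reasoning before generating visible output. For these models, the max_tokens parameter used with standard chat models (gpt-4o, gpt-4o-mini) is deprecated in favor of max_completion_tokens. The newer name reflects that the limit is an upper bound on everything the model generates for the completion, which includes the hidden reasoning tokens as well as the visible answer, so a limit that is too low can be used up by reasoning before any output appears. Passing the old parameter name fails because it is explicitly rejected for these models; depending on the SDK or wrapper version, this can surface as the TypeError shown above or as an error response from the API.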
Detection
Add explicit parameter validation when calling o1/o3 models. If max_tokens is passed to client.chat.completions.create() for one of these models, the call fails immediately with the TypeError rather than partway through a response, so a single test call against a live API key before deploying is enough to catch it.
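The validation described above can be sketched as a small pre-flight check that runs before any request is sent. This is only an illustration: check_reasoning_params, is_reasoning_model, and REASONING_PREFIXES are hypothetical names, not part of the OpenAI SDK, and the prefix list is an assumption that covers only the model families named on this page.

```python
REASONING_PREFIXES = ("o1", "o3")  # assumption: only the model families named on this page


def is_reasoning_model(model: str) -> bool:
    """Return True when the model name starts with a known reasoning-model prefix."""
    return model.startswith(REASONING_PREFIXES)


def check_reasoning_params(model: str, **params) -> None:
    """Raise ValueError early if max_tokens is paired with a reasoning model."""
    if is_reasoning_model(model) and "max_tokens" in params:
        raise ValueError(
            f"{model!r} does not accept max_tokens; use max_completion_tokens instead"
        )


# Usage: call the check with the same keyword arguments you plan to send.
check_reasoning_params("gpt-4o", max_tokens=2000)         # passes silently
check_reasoning_params("o1", max_completion_tokens=2000)  # passes silently
try:
    check_reasoning_params("o3-mini", max_tokens=2000)
except ValueError as exc:
    print(f"caught: {exc}")
```

Because the check needs no network access, it can run in unit tests or at application start-up, ahead of the live-key test.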
Causes & fixes
Cause: Using the max_tokens parameter with an o1 or o3 model (legacy parameter name from standard chat models).
Fix: Replace max_tokens with max_completion_tokens. Example: max_completion_tokens=2000 instead of max_tokens=2000.
Cause: Copying code from gpt-4o examples without checking the reasoning model documentation.
Fix: Review the OpenAI reasoning model docs at platform.openai.com/docs/guides/reasoning. Reasoning models have a different parameter interface than standard chat models.
Cause: Not setting an explicit output limit, or assuming max_tokens works as the default for every model.
Fix: For o1/o3 models, always set max_completion_tokens explicitly. Choose the value based on your use case; as a rough starting point, at least 1000 for simple tasks and 10000 or more for complex reasoning, since reasoning tokens count toward the limit.
Cause: Using an outdated LangChain or wrapper library that passes max_tokens to o1/o3 without translation.
Fix: Upgrade to langchain>=0.3.0 and a current release of the openai package. With model="o1" in LangChain's ChatOpenAI, recent versions translate max_tokens to max_completion_tokens automatically.
Code: broken vs fixed
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
messages = [
    {"role": "user", "content": "Solve this math problem: what is 17 * 24?"}
]

# ❌ BROKEN: max_tokens is not allowed for o1/o3 models
response = client.chat.completions.create(
    model="o1",
    max_tokens=2000,  # ← TypeError: unexpected keyword argument
    messages=messages
)
print(response.choices[0].message.content)

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
messages = [
    {"role": "user", "content": "Solve this math problem: what is 17 * 24?"}
]

# ✅ FIXED: Use max_completion_tokens for o1/o3 reasoning models
response = client.chat.completions.create(
    model="o1",
    max_completion_tokens=2000,  # ← Changed from max_tokens to max_completion_tokens
    messages=messages
)
print(response.choices[0].message.content)
Workaround
If you need to support both standard chat models (gpt-4o) and reasoning models (o1) in the same codebase, create a conditional wrapper: if model.startswith('o1') or model.startswith('o3'), use max_completion_tokens; otherwise use max_tokens. This allows gradual migration without rewriting all call sites.
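A minimal sketch of that conditional, assuming the prefix test described above is sufficient for the models you use. The helper name output_limit_kwargs is hypothetical; it only builds the keyword argument and makes no API call itself.

```python
def output_limit_kwargs(model: str, max_output: int) -> dict:
    """Return the output-limit keyword argument appropriate for the model family."""
    if model.startswith("o1") or model.startswith("o3"):
        return {"max_completion_tokens": max_output}
    return {"max_tokens": max_output}


# Usage: unpack the result into the real call, for example
#   client.chat.completions.create(model=model, messages=messages,
#                                  **output_limit_kwargs(model, 2000))
print(output_limit_kwargs("o1", 2000))
print(output_limit_kwargs("gpt-4o", 2000))
```

Keeping the decision in one helper means existing call sites only need to swap a literal max_tokens=... for the unpacked dictionary.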
Prevention
Build a model-aware API wrapper that abstracts parameter differences between model families. Example:
    def create_completion(model, max_output=2000):
        if model in ['o1', 'o3', 'o3-mini']:
            return client.chat.completions.create(model=model, max_completion_tokens=max_output, ...)
        else:
            return client.chat.completions.create(model=model, max_tokens=max_output, ...)
Test all model families in your test suite explicitly.
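One way to test every model family without spending API credits is to hand the wrapper a stand-in client that records what it was called with. The sketch below is an assumption-based illustration: this create_completion variant takes the client and messages as explicit arguments so it can be tested, and make_recording_client is a hypothetical test helper, not part of the OpenAI SDK.

```python
from types import SimpleNamespace


def create_completion(client, model, messages, max_output=2000):
    """Hypothetical wrapper: choose the output-limit parameter by model family."""
    if model in ("o1", "o3", "o3-mini"):
        limit = {"max_completion_tokens": max_output}
    else:
        limit = {"max_tokens": max_output}
    return client.chat.completions.create(model=model, messages=messages, **limit)


def make_recording_client(calls):
    """Build a stand-in client whose create() only records its keyword arguments."""
    def create(**kwargs):
        calls.append(kwargs)
        return kwargs

    completions = SimpleNamespace(create=create)
    return SimpleNamespace(chat=SimpleNamespace(completions=completions))


def test_output_limit_parameter():
    """Check that each model family receives exactly one of the two limit parameters."""
    messages = [{"role": "user", "content": "ping"}]
    expected = {
        "o1": "max_completion_tokens",
        "o3-mini": "max_completion_tokens",
        "gpt-4o": "max_tokens",
    }
    for model, param in expected.items():
        calls = []
        create_completion(make_recording_client(calls), model, messages, max_output=50)
        assert calls[0][param] == 50
        other = ({"max_tokens", "max_completion_tokens"} - {param}).pop()
        assert other not in calls[0]


test_output_limit_parameter()
```

A stub like this verifies only which parameter name is sent; it does not replace a live call to confirm that the API accepts the request.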