High severity · HTTP 400 · Intermediate · Fix: 5-10 min

InvalidRequestError

openai.InvalidRequestError (embedding input too long)

What this error means
An OpenAI embedding request fails with HTTP 400 when the input text exceeds the model's maximum input length in tokens (8191 for text-embedding-3-small and text-embedding-3-large).

Stack trace

traceback
openai.InvalidRequestError: This model's maximum context length is 8191 tokens, however you requested 9000 tokens. Please reduce the length of the input.
QUICK FIX
Add token count checks using the model's tokenizer and truncate inputs exceeding the max token limit before embedding calls.

Why it happens

OpenAI embedding models accept a fixed maximum number of input tokens. When an input is longer, the API rejects the request with HTTP 400 rather than truncating it silently. Versions of the Python library before 1.0 raise this as openai.InvalidRequestError; in openai >= 1.0, which this page targets, the same response is raised as openai.BadRequestError.

Detection

Before sending embedding requests, measure the token count of the input text using a tokenizer compatible with the embedding model and log or assert if it exceeds the limit.
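A minimal sketch of such a pre-flight check follows. The names `MAX_EMBEDDING_TOKENS`, `count_tokens`, and `check_input` are illustrative and not part of the openai or tiktoken packages. The tokenizer is passed in as a plain callable, on the assumption that you will supply one that matches your embedding model.

```python
import logging
from typing import Callable, Sequence

logger = logging.getLogger(__name__)

# Assumed limit, taken from the error message above; adjust per model.
MAX_EMBEDDING_TOKENS = 8191

# Any function that maps text to a sequence of token ids.
Encoder = Callable[[str], Sequence[int]]


def count_tokens(text: str, encode: Encoder) -> int:
    """Return the number of tokens `encode` produces for `text`."""
    return len(encode(text))


def check_input(text: str, encode: Encoder, limit: int = MAX_EMBEDDING_TOKENS) -> bool:
    """Return True if `text` fits within `limit` tokens; log a warning otherwise."""
    n = count_tokens(text, encode)
    if n > limit:
        logger.warning("Embedding input has %d tokens (limit %d)", n, limit)
        return False
    return True
```

With tiktoken installed, `encode` could be `tiktoken.get_encoding("cl100k_base").encode`, assuming cl100k_base is the encoding for the model in use.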

Causes & fixes

1

Input text length exceeds the embedding model's maximum token limit

✓ Fix

Truncate or split the input text into smaller chunks below the token limit before calling the embedding API.

2

Token count underestimated because it was approximated from characters or words instead of being measured with the model's tokenizer

✓ Fix

Use the same tokenizer as the embedding model (e.g., tiktoken for OpenAI models) to accurately count tokens before sending.

3

Multiple texts concatenated into a single input string that exceeds the token limit

✓ Fix

Pass the texts as a list so that each item is embedded separately, and keep every item within the token limit.
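Splitting an over-long input can be sketched as a fixed-size token window. This is an illustration under assumptions: `split_tokens` is a hypothetical helper, the default window of 8191 comes from the error message above, and the token list is expected to come from the embedding model's own tokenizer.

```python
from typing import List, Sequence


def split_tokens(tokens: Sequence[int], max_tokens: int = 8191) -> List[List[int]]:
    """Split a token sequence into consecutive windows of at most `max_tokens`."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    # Step through the sequence one window at a time; the last window may be shorter.
    return [list(tokens[i:i + max_tokens]) for i in range(0, len(tokens), max_tokens)]
```

Each window can be decoded back to text with the same tokenizer before embedding. Fixed windows can cut a sentence in half, so splitting on paragraph or sentence boundaries first may give more useful embeddings.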

Code: broken vs fixed

Broken - triggers the error
python
from openai import OpenAI
client = OpenAI()

text = "hello " * 9000  # Roughly 9000 tokens, over the 8191-token limit ("A" * 9000 would merge into far fewer tokens)
response = client.embeddings.create(model="text-embedding-3-large", input=text)  # Raises openai.BadRequestError (HTTP 400) in openai >= 1.0
print(response)
Fixed - works correctly
python
import os
from openai import OpenAI
import tiktoken

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

text = "hello " * 9000  # Roughly 9000 tokens, over the limit

# Count tokens with the model's own encoding (cl100k_base for text-embedding-3-*)
tokenizer = tiktoken.encoding_for_model("text-embedding-3-large")
tokens = tokenizer.encode(text)
max_tokens = 8191

if len(tokens) > max_tokens:
    # Keep only the first max_tokens tokens and turn them back into text
    tokens = tokens[:max_tokens]
    text = tokenizer.decode(tokens)

response = client.embeddings.create(model="text-embedding-3-large", input=text)  # Input is now within the limit
print(response)
The fixed version counts tokens with tiktoken and truncates the input to the model's limit, so the request is no longer rejected. Truncation discards everything past the limit; split the text into chunks instead if the tail matters.

Workaround

Catch the exception (openai.BadRequestError in openai >= 1.0, InvalidRequestError in earlier versions), split the input text into chunks under the token limit, and retry the embedding call on each chunk separately.
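The catch-and-retry flow might look like the sketch below. Everything here is hypothetical scaffolding: `embed_with_fallback`, `embed`, and `split` are illustrative names, `embed` is assumed to wrap your `client.embeddings.create(...)` call and return one vector, and `too_long_error` is the exception class to catch (with openai >= 1.0 that would be `openai.BadRequestError`).

```python
from typing import Callable, List, Sequence, Type

Vector = List[float]


def embed_with_fallback(
    text: str,
    embed: Callable[[str], Vector],
    split: Callable[[str], Sequence[str]],
    too_long_error: Type[Exception],
) -> List[Vector]:
    """Embed `text` whole; if that raises `too_long_error`, embed each chunk instead."""
    try:
        return [embed(text)]
    except too_long_error:
        # Retry once per chunk. A chunk that is still too long re-raises the error.
        return [embed(chunk) for chunk in split(text)]
```

Because `openai.BadRequestError` covers every HTTP 400 response, a real handler would probably inspect the error message before deciding to split, so that unrelated bad requests are not retried.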

Prevention

Integrate token counting with the exact tokenizer used by the embedding model in your preprocessing pipeline to enforce input length limits before API calls.
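One way to wire this into a preprocessing step is sketched below. `enforce_token_limit` is a hypothetical helper, not a library function; it assumes `encode` is the tokenizer for your embedding model and that 8191 is the applicable limit.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple


def enforce_token_limit(
    texts: Iterable[str],
    encode: Callable[[str], Sequence[int]],
    limit: int = 8191,
) -> Iterator[Tuple[List[int], bool]]:
    """Yield (tokens, was_truncated) per text, with tokens cut to at most `limit`."""
    for text in texts:
        tokens = list(encode(text))
        # Flag truncation so callers can log it or route the text to a chunking step.
        yield tokens[:limit], len(tokens) > limit
```

The `was_truncated` flag keeps the truncation visible rather than silent, which makes it easier to spot documents that should be chunked instead.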

Python 3.9+ · openai >=1.0.0 · tested on 1.7.x
Verified 2026-04 · text-embedding-3-large, text-embedding-3-small