High severity · Intermediate · Fix: 2-5 min

OpenAIError

openai.OpenAIError (embedding batch size too large)

What this error means

The OpenAI embeddings API rejects a request whose input list contains more items, or more total tokens, than a single request allows. The Python client then raises an error instead of returning embeddings.

Stack trace

traceback

openai.OpenAIError: The request batch size is too large. Maximum allowed batch size is 2048 tokens or 2048 items depending on the model.

QUICK FIX

Reduce the number of inputs per embeddings request to at or below the documented maximum (2048 items in the error above), and keep the batch's total token count within the model's per-request token limit.

Why it happens

The embeddings API caps both the number of input texts and the total number of tokens it will process in a single request. A request that exceeds either cap is rejected as a whole, so no embeddings are returned for any of its inputs.

Detection

Watch API error responses for OpenAIError messages that report an exceeded batch size limit, and log each batch's size before sending the request so oversized batches are caught early.
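One way to log batch sizes before each request is a small pre-flight check. The sketch below is illustrative only: `check_batch`, the logger name, and the `MAX_ITEMS_PER_REQUEST` value of 2048 (taken from the error message above) are assumptions, so confirm the limit against the current documentation.

```python
import logging

logger = logging.getLogger("embeddings.batching")  # hypothetical logger name

# Assumed limit, copied from the error message; verify before relying on it.
MAX_ITEMS_PER_REQUEST = 2048


def check_batch(batch, max_items=MAX_ITEMS_PER_REQUEST):
    """Log the batch size and return True if the batch fits within max_items."""
    size = len(batch)
    logger.info("embedding batch size: %d (limit %d)", size, max_items)
    if size > max_items:
        logger.warning("batch of %d items exceeds limit of %d", size, max_items)
        return False
    return True
```

Calling `check_batch(batch)` just before `client.embeddings.create(...)` leaves a log line for every request, which makes an oversized batch visible even when the error is handled elsewhere.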

Causes & fixes

Sending more input texts in one embeddings request than the model's maximum batch size limit.

✓ Fix

Split your input list into smaller batches that respect the model's documented maximum batch size before calling the embeddings API.

Ignoring token count limits per batch when inputs are very long texts.

✓ Fix

Calculate the total token count of inputs in a batch and reduce batch size if the token count exceeds the model's maximum allowed tokens per request.
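The token check described above can be combined with the item cap in one batching helper. This is a minimal sketch under stated assumptions: `token_aware_batches` and `rough_token_count` are hypothetical names, and `rough_token_count` is a crude four-characters-per-token estimate rather than a real tokenizer. For accurate counts, pass a tokenizer-backed function (for example one built on the tiktoken package) as `count_tokens`.

```python
from typing import Callable, Iterable, Iterator, List


def rough_token_count(text: str) -> int:
    """Crude stand-in: roughly 4 characters per token. Not a real tokenizer."""
    return max(1, len(text) // 4)


def token_aware_batches(
    texts: Iterable[str],
    max_items: int,
    max_tokens: int,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> Iterator[List[str]]:
    """Yield batches that stay within both the item cap and the token cap."""
    batch: List[str] = []
    batch_tokens = 0
    for text in texts:
        n = count_tokens(text)
        if n > max_tokens:
            # A single input this long can never fit; batching cannot fix it.
            raise ValueError(f"single input has {n} tokens, above max_tokens={max_tokens}")
        # Start a new batch if adding this text would break either cap.
        if batch and (len(batch) >= max_items or batch_tokens + n > max_tokens):
            yield batch
            batch, batch_tokens = [], 0
        batch.append(text)
        batch_tokens += n
    if batch:
        yield batch
```

Both caps are left as parameters so the values can follow whatever the current documentation states for the model in use.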

Using an outdated or incorrect batch size assumption not aligned with the current API limits.

✓ Fix

Check the latest OpenAI embeddings API documentation for current batch size limits and update your batching logic accordingly.

Code: broken vs fixed

Broken - triggers the error

python

from openai import OpenAI
import os

client = OpenAI(api_key=os.environ['OPENAI_API_KEY'])
inputs = [f'text{i}' for i in range(1, 5001)]  # 5000 items: too large for one request
response = client.embeddings.create(model='text-embedding-3-large', input=inputs)  # This line triggers the error

Fixed - works correctly

python

from openai import OpenAI
import os

client = OpenAI(api_key=os.environ['OPENAI_API_KEY'])
inputs = [f'text{i}' for i in range(1, 5001)]  # 5000 items, sent in batches below

# Split inputs into batches of max 1000 to avoid batch size error
batch_size = 1000
embeddings = []
for i in range(0, len(inputs), batch_size):
    batch = inputs[i:i+batch_size]
    response = client.embeddings.create(model='text-embedding-3-large', input=batch)  # Fixed batch size
    embeddings.extend(response.data)

print(f"Processed {len(embeddings)} embeddings successfully.")

The fixed version splits the inputs into chunks of 1000 items, which keeps every request under the batch size limit, so no single call is rejected as too large.

⚠ Workaround

Catch the OpenAIError exception (in openai >=1.0 a rejected request is typically raised as a subclass such as BadRequestError), then retry the embeddings request with smaller batches by slicing the input list dynamically.
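The retry-with-smaller-slices idea might look like the sketch below. It is not a definitive implementation: `embed_with_backoff` and `embed_fn` are hypothetical names, where `embed_fn` stands for any callable that sends one batch and returns a list of results. The default `retry_on=(Exception,)` is a placeholder; in real use, pass the specific exception class your client raises for a rejected request.

```python
def embed_with_backoff(embed_fn, inputs, batch_size, retry_on=(Exception,)):
    """Call embed_fn on slices of inputs, halving the slice size when a call fails."""
    results = []
    start = 0
    while start < len(inputs):
        batch = inputs[start:start + batch_size]
        try:
            results.extend(embed_fn(batch))
        except retry_on:
            if batch_size == 1:
                raise  # cannot shrink further; surface the original error
            batch_size = max(1, batch_size // 2)
            continue  # retry from the same position with a smaller slice
        start += len(batch)
    return results
```

Catching only the rejection error matters here: halving the batch does nothing for authentication or rate-limit failures, so a narrow `retry_on` avoids pointless retries.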

✓ Prevention

Implement automatic batching in your embedding pipeline that respects the model's documented batch size and token limits, and monitor API error responses to adjust batch sizes proactively.

Python 3.9+ · openai >=1.0.0 · tested on 1.5.x

Verified 2026-04
