High severity HTTP 413 beginner · Fix: 5-15 min

413 Request Entity Too Large or InvalidRequestError

openai.APIStatusError (HTTP 413) or openai.BadRequestError (HTTP 400 validation error); openai.InvalidRequestError in pre-1.0 versions of the library

What this error means

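OpenAI's Whisper API rejects audio uploads larger than 25MB. Depending on where the request is rejected, the client raises either a 413 status error or a 400 validation error (BadRequestError in openai 1.x, InvalidRequestError in pre-1.0 versions).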

Stack trace

traceback

openai.APIStatusError: Error code: 413 -- {"error": {"message": "Request body too large", "type": "invalid_request_error", "param": null, "code": "request_too_large"}}

or

openai.InvalidRequestError: Error code: 400 -- {"error": {"message": "file size must be below 25 MB", "type": "invalid_request_error"}}

QUICK FIX

Compress your audio to 64-128 kbps MP3 using ffmpeg (ffmpeg -i input.wav -acodec libmp3lame -ab 128k output.mp3) before uploading. If the recording is too long to fit under 25MB even when compressed, split it into 20-minute segments and transcribe each segment separately.

Why it happens

OpenAI's Whisper API (whisper-1 model) has a hard 25MB file size limit, presumably to bound infrastructure cost and latency. When you attempt to transcribe an audio file larger than this limit via client.audio.transcriptions.create(), the API rejects the request before any transcription takes place. This is a documented API constraint, not a bug.

Detection

Check file size before uploading: if os.path.getsize(audio_file_path) > 25 * 1024 * 1024, the file needs compression or splitting first. Add logging to track file sizes in production: logger.info(f'Audio file size: {os.path.getsize(path) / 1024 / 1024:.2f}MB').
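The check and the log line above can be wrapped in one small helper. This is a minimal sketch: the function name needs_preprocessing, the limit_bytes parameter, and the logger name are illustrative assumptions, and the limit is taken as 25 * 1024 * 1024 bytes as in the text.

```python
import logging
import os

logger = logging.getLogger("whisper_upload")  # hypothetical logger name

WHISPER_LIMIT_BYTES = 25 * 1024 * 1024  # limit as stated in this article


def needs_preprocessing(path: str, limit_bytes: int = WHISPER_LIMIT_BYTES) -> bool:
    """Return True if the file at `path` is larger than the upload limit."""
    size = os.path.getsize(path)
    # Log the size in MB so oversized uploads are visible in production logs
    logger.info("Audio file size: %.2fMB", size / 1024 / 1024)
    return size > limit_bytes
```

Calling needs_preprocessing('audio.wav') before the upload lets the caller route the file to compression or splitting instead of waiting for the API to reject it.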

Causes & fixes

Audio file is genuinely larger than 25MB (e.g., uncompressed WAV, high-bitrate MP3, or a long recording)

✓ Fix

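Compress the audio file using ffmpeg before upload: ffmpeg -i input.wav -acodec libmp3lame -ab 64k output.mp3. At 64 kbps the output takes about 0.5MB per minute regardless of the source format, so a little under an hour of audio fits below 25MB. Speech usually remains intelligible at this bitrate.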

Uploading a lossless or high-bitrate audio file (FLAC, WAV with 48kHz PCM, or MP3 at 320kbps)

✓ Fix

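Transcode to MP3 or M4A at 64-128 kbps bitrate using ffmpeg: ffmpeg -i input.flac -acodec libmp3lame -ab 128k output.mp3. At 128 kbps the output takes roughly 1MB per minute of audio, so recordings up to about 25 minutes fit under the limit; how much the file shrinks depends on the source format and duration.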
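Because constant-bitrate audio has a predictable size, it can help to estimate the output before transcoding. The sketch below is plain arithmetic (bitrate times duration) that ignores container overhead; the function names are hypothetical and the limit is assumed to be 25 * 1024 * 1024 bytes.

```python
LIMIT_BYTES = 25 * 1024 * 1024  # limit as stated in this article


def estimated_size_bytes(duration_s: float, bitrate_kbps: int) -> float:
    """Approximate size of constant-bitrate audio, ignoring container overhead."""
    return duration_s * bitrate_kbps * 1000 / 8


def max_minutes(bitrate_kbps: int, limit_bytes: int = LIMIT_BYTES) -> float:
    """Longest recording (in minutes) that stays under limit_bytes at this bitrate."""
    return limit_bytes * 8 / (bitrate_kbps * 1000) / 60


for kbps in (64, 128):
    print(f"{kbps} kbps: about {max_minutes(kbps):.0f} minutes fit under the limit")
```

Under these assumptions the loop reports about 55 minutes at 64 kbps and about 27 minutes at 128 kbps, which is a quick way to decide whether compression alone is enough or the recording also needs splitting.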

Recording is too long to fit under 25MB even after compression, or it has already been split but each segment still exceeds 25MB

✓ Fix

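Pre-split the audio into 20-minute chunks using pydub or ffmpeg before calling Whisper: from pydub import AudioSegment; audio = AudioSegment.from_file('input.wav'); chunks = [audio[i:i+20*60*1000] for i in range(0, len(audio), 20*60*1000)]. Export each chunk to a compressed format such as MP3 (a 20-minute chunk of uncompressed WAV is typically still over the limit), then call client.audio.transcriptions.create() for each chunk in a loop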
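If ffmpeg is available, splitting and compressing can be done in a single pass with its segment muxer. The following is a minimal sketch, not a definitive recipe: build_split_command, split_audio, and the output pattern are hypothetical names, and the 20-minute and 64 kbps defaults are taken from the figures in this article.

```python
import subprocess


def build_split_command(src: str, out_pattern: str = "chunk_%03d.mp3",
                        segment_s: int = 20 * 60, bitrate: str = "64k") -> list[str]:
    """Build an ffmpeg command that re-encodes to MP3 and cuts fixed-length segments."""
    return [
        "ffmpeg", "-i", src,
        "-vn",                                    # ignore any video stream
        "-c:a", "libmp3lame", "-b:a", bitrate,    # re-encode audio as MP3
        "-f", "segment", "-segment_time", str(segment_s),
        "-y", out_pattern,                        # chunk_000.mp3, chunk_001.mp3, ...
    ]


def split_audio(src: str) -> None:
    """Run the split; raises CalledProcessError if ffmpeg exits non-zero."""
    subprocess.run(build_split_command(src), check=True, capture_output=True)
```

Fixed-length cuts can land in the middle of a word, so transcripts joined across chunk boundaries may need a light manual check.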

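File is only just under 25MB on disk, and client-side handling (multipart encoding, reading the file into memory) is suspected of pushing the request over the limit

✓ Fix

Pass an open binary file handle inside a context manager: with open('audio.mp3', 'rb') as f: response = client.audio.transcriptions.create(model='whisper-1', file=f). This keeps the handle from leaking, but it does not shrink the request. Multipart encoding adds only a small overhead, so if the file sits right at the limit, compress it slightly to leave headroom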

Code: broken vs fixed

Broken - triggers the error

python

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ.get('OPENAI_API_KEY'))

# This will fail if audio.wav > 25MB
response = client.audio.transcriptions.create(
    model='whisper-1',
    file=open('audio.wav', 'rb')  # ❌ NO SIZE CHECK — if >25MB, 413 error
)
print(response.text)

Fixed - works correctly

python

import os
import subprocess
from pathlib import Path
from openai import OpenAI, APIStatusError

client = OpenAI(api_key=os.environ.get('OPENAI_API_KEY'))

def compress_audio_if_needed(audio_path: str, max_size_mb: float = 25) -> str:
    """Compress audio to 128kbps MP3 if file exceeds max_size_mb."""
    file_size_mb = os.path.getsize(audio_path) / (1024 * 1024)
    if file_size_mb <= max_size_mb:
        return audio_path
    # Write the compressed copy next to the source file rather than into the CWD
    src = Path(audio_path)
    output_path = str(src.with_name(src.stem + '_compressed.mp3'))
    # ✅ FIXED: Compress to 128kbps MP3 to reduce file size
    subprocess.run([
        'ffmpeg', '-i', audio_path, '-acodec', 'libmp3lame',
        '-ab', '128k', '-y', output_path
    ], check=True, capture_output=True)
    new_size_mb = os.path.getsize(output_path) / (1024 * 1024)
    print(f'Compressed {file_size_mb:.2f}MB → {new_size_mb:.2f}MB')
    if new_size_mb > max_size_mb:
        # Long recordings can still exceed the limit at 128kbps
        raise ValueError(
            f'{output_path} is still {new_size_mb:.2f}MB; lower the bitrate or split the audio'
        )
    return output_path

try:
    audio_file = compress_audio_if_needed('audio.wav')
    with open(audio_file, 'rb') as f:
        response = client.audio.transcriptions.create(
            model='whisper-1',
            file=f
        )
    print(f'Transcription: {response.text}')
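except APIStatusError as e:
    # 413: request body too large; the file-size validation error can also arrive as 400
    if e.status_code in (400, 413):
        print('Error: request rejected, file may still be too large. Try a lower bitrate or split the audio.')
    raise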

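Added compress_audio_if_needed() to detect files >25MB and transcode them to 128kbps MP3 before upload. For CD-quality WAV (about 1,411 kbps) this cuts file size by roughly 90%; sources that are already compressed shrink less. At 128 kbps the result stays under the limit only for recordings up to roughly 26 minutes, so longer audio needs a lower bitrate or splitting.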

⚠

Workaround

If shelling out to ffmpeg directly is not an option, use pydub to split the audio into 20-minute segments, upload each segment separately, and concatenate the transcription results. Note that pydub itself relies on ffmpeg (or libav) for any format other than WAV, so MP3 export still needs one of them installed. Load the file with audio = AudioSegment.from_file('large.wav'), slice it with chunks = [audio[i:i+20*60*1000] for i in range(0, len(audio), 20*60*1000)], export each chunk to a named file with chunk.export('chunk.mp3', format='mp3').close(), pass that file (opened in 'rb' mode) to client.audio.transcriptions.create(model='whisper-1', file=f), and join the resulting .text values with ' '.join(transcripts).
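The workaround can be sketched as a loop rather than a one-liner. This is an illustration under stated assumptions: chunk_bounds and transcribe_in_chunks are hypothetical helper names, client is assumed to be an openai.OpenAI instance, 64 kbps is an assumed export bitrate, and pydub is imported lazily because its MP3 export depends on ffmpeg being installed.

```python
import os
import tempfile

CHUNK_MS = 20 * 60 * 1000  # 20 minutes in milliseconds, as suggested above


def chunk_bounds(total_ms: int, chunk_ms: int = CHUNK_MS) -> list[tuple[int, int]]:
    """Return (start, end) millisecond pairs covering [0, total_ms)."""
    return [(s, min(s + chunk_ms, total_ms)) for s in range(0, total_ms, chunk_ms)]


def transcribe_in_chunks(path: str, client) -> str:
    """Split `path` with pydub, upload each chunk as MP3, and join the text."""
    from pydub import AudioSegment  # lazy import; MP3 export requires ffmpeg

    audio = AudioSegment.from_file(path)
    transcripts = []
    with tempfile.TemporaryDirectory() as tmp:
        for n, (start, end) in enumerate(chunk_bounds(len(audio))):
            chunk_path = os.path.join(tmp, f"chunk_{n:03d}.mp3")
            # export() returns an open handle; close it so the data is flushed to disk
            audio[start:end].export(chunk_path, format="mp3", bitrate="64k").close()
            with open(chunk_path, "rb") as f:
                result = client.audio.transcriptions.create(model="whisper-1", file=f)
            transcripts.append(result.text)
    return " ".join(transcripts)
```

Exporting each chunk to a named .mp3 file, rather than passing the object returned by export() straight to the API, means the upload carries a recognizable file extension.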

✓

Prevention

Always check file size before calling the Whisper API (if os.path.getsize(file) > 25 * 1024 * 1024, compress or split). For production pipelines, implement automatic compression middleware that targets 64-128 kbps MP3 as the default upload format. For very large audio archives, pre-process audio at ingest time, not at transcription request time.
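An ingest-time step along these lines might pick a bitrate from the recording's duration. The sketch below is one possible shape, not a prescribed design: probe_duration_s and pick_bitrate_kbps are hypothetical names, the 128/96/64 ladder and the 5% headroom are assumptions, and the duration lookup assumes ffprobe is installed alongside ffmpeg.

```python
import subprocess
from typing import Optional

LIMIT_BYTES = 25 * 1024 * 1024        # limit as stated in this article
BITRATE_LADDER_KBPS = (128, 96, 64)   # assumed ladder within the 64-128 kbps range


def probe_duration_s(path: str) -> float:
    """Ask ffprobe for the container duration in seconds."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return float(out.strip())


def pick_bitrate_kbps(duration_s: float, headroom: float = 0.95) -> Optional[int]:
    """Return the highest ladder bitrate whose estimated size fits, else None."""
    for kbps in BITRATE_LADDER_KBPS:
        # Estimated bytes = seconds * bits per second / 8
        if duration_s * kbps * 1000 / 8 <= LIMIT_BYTES * headroom:
            return kbps
    return None  # too long for any ladder bitrate: split the audio instead
```

A pipeline could call pick_bitrate_kbps(probe_duration_s(path)) at ingest and fall back to splitting when it returns None.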

Python 3.9+ · openai >=1.0.0 · tested on 1.3+

Verified 2026-04 · whisper-1
