How to choose Whisper model size
Quick answer
Choose a Whisper model size based on your accuracy needs and available compute. Larger models such as whisper-large transcribe more accurately but require more memory and CPU/GPU power, while smaller models such as whisper-base or whisper-small run faster and use fewer resources at the cost of some accuracy.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install openai>=1.0
Setup
Install the OpenAI Python package to access Whisper models via the API. Set your OPENAI_API_KEY environment variable for authentication.
pip install openai

Step by step
Use the OpenAI API to transcribe audio. The hosted API's standard Whisper model is whisper-1; explicit size selection (tiny through large) applies to the open-source releases described under Common variations below.
```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Upload the audio file and request a transcription from whisper-1.
with open("audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

print("Transcription:", transcript.text)
```

Output
Transcription: Hello, this is a sample audio transcription.
Common variations
Whisper models vary by size: tiny, base, small, medium, and large. Smaller models run faster and use less memory but have lower accuracy. Larger models improve transcription quality, especially on noisy or accented audio, but require more compute.
Use local open-source Whisper models with openai-whisper or whisper.cpp for offline use and control over model size.
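As a minimal sketch of the local route, the helper below wraps the open-source openai-whisper package. The function name transcribe_locally and the default size are illustrative; the package (and ffmpeg) must be installed separately, and the chosen checkpoint is downloaded on first use.

```python
def transcribe_locally(audio_path, size="small"):
    """Transcribe a file with the open-source openai-whisper package.

    size is one of the released checkpoints: tiny, base, small, medium, large.
    """
    valid = {"tiny", "base", "small", "medium", "large"}
    if size not in valid:
        raise ValueError(f"unknown Whisper model size: {size}")
    import whisper  # pip install openai-whisper (ffmpeg is also required)
    model = whisper.load_model(size)  # downloads the checkpoint on first use
    return model.transcribe(audio_path)["text"]
```

Swapping size here is the direct way to trade accuracy against speed and memory on your own hardware.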
| Model size | Accuracy | Speed | Memory usage |
|---|---|---|---|
| tiny | Lowest | Fastest | Lowest |
| base | Low | Fast | Low |
| small | Moderate | Moderate | Moderate |
| medium | High | Slower | High |
| large | Highest | Slowest | Highest |
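To make the tradeoff concrete, here is a small illustrative helper that picks the largest size fitting a memory budget. The VRAM_GB figures are the approximate requirements listed in the openai-whisper README, rounded to whole gigabytes.

```python
# Approximate VRAM needed per model size, in GB (openai-whisper README, rounded).
VRAM_GB = {"tiny": 1, "base": 1, "small": 2, "medium": 5, "large": 10}
SIZES = ["tiny", "base", "small", "medium", "large"]  # smallest to largest

def choose_model_size(memory_gb):
    """Return the most accurate size that fits the memory budget."""
    fitting = [s for s in SIZES if VRAM_GB[s] <= memory_gb]
    return fitting[-1] if fitting else "tiny"  # fall back to tiny if nothing fits
```

For example, a 4 GB budget selects small, while 16 GB allows large.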
Troubleshooting
- If transcription is inaccurate, try a larger model size or improve audio quality.
- If you encounter memory errors, switch to a smaller model or use streaming transcription.
- For slow inference, consider running smaller models locally or using GPU acceleration.
Key Takeaways
- Select Whisper model size balancing accuracy and resource constraints.
- Use larger models for noisy or complex audio, smaller for speed and low resource use.
- OpenAI API uses whisper-1 as the default production model.
- Local open-source Whisper allows explicit model size choice for offline use.
- Test different sizes on your audio to find the best tradeoff.