How-to · Beginner · 3 min read

How to choose Whisper model size

Quick answer
Choose a Whisper model size based on your accuracy needs and available compute. Larger models like whisper-large provide higher transcription accuracy but require more memory and CPU/GPU power, while smaller models like whisper-small or whisper-base offer faster inference and lower resource use at the cost of some accuracy. Note that explicit size selection applies to the open-source Whisper checkpoints you run locally; the hosted OpenAI API exposes a single Whisper model, whisper-1.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key
  • pip install openai>=1.0

Setup

Install the OpenAI Python package to access Whisper models via the API. Set your OPENAI_API_KEY environment variable for authentication.

bash
pip install "openai>=1.0"

Step by step

Use the OpenAI API to transcribe audio with Whisper. The example below uses whisper-1, the only Whisper model the hosted API exposes; to choose an explicit size (tiny through large), run the open-source model locally instead (see Common variations).

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

with open("audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file
    )

print("Transcription:", transcript.text)
output
Transcription: Hello, this is a sample audio transcription.

Common variations

Whisper models vary by size: tiny, base, small, medium, and large. Smaller models run faster and use less memory but have lower accuracy. Larger models improve transcription quality, especially on noisy or accented audio, but require more compute.

Use local open-source Whisper models with openai-whisper or whisper.cpp for offline use and control over model size.
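With local checkpoints, size selection usually comes down to fitting your memory budget. The sketch below picks the largest checkpoint that fits a given amount of VRAM; the per-model figures are the approximate requirements listed in the openai-whisper README and should be treated as rough guides, not hard limits.

```python
# Approximate VRAM needs (GB) per checkpoint, from the openai-whisper README.
# These are rough guides; actual usage varies with backend and audio length.
APPROX_VRAM_GB = {"tiny": 1, "base": 1, "small": 2, "medium": 5, "large": 10}

def largest_model_for(vram_gb: float) -> str:
    """Return the largest checkpoint that fits the given VRAM budget."""
    fitting = [name for name, gb in APPROX_VRAM_GB.items() if gb <= vram_gb]
    return fitting[-1] if fitting else "tiny"  # fall back to the smallest

name = largest_model_for(6)
print("chosen model:", name)  # chosen model: medium

# With openai-whisper installed (pip install openai-whisper), load and use it:
# import whisper
# model = whisper.load_model(name)
# print(model.transcribe("audio.mp3")["text"])
```

The commented lines show the real openai-whisper calls (`whisper.load_model`, `model.transcribe`); they are left commented because they download model weights on first use.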

Model size   Accuracy   Speed      Memory usage
tiny         Lowest     Fastest    Lowest
base         Low        Fast       Low
small        Moderate   Moderate   Moderate
medium       High       Slower     High
large        Highest    Slowest    Highest

Troubleshooting

  • If transcription is inaccurate, try a larger model size or improve audio quality.
  • If you encounter memory errors, switch to a smaller model or use streaming transcription.
  • For slow inference, consider running smaller models locally or using GPU acceleration.
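The "switch to a smaller model on memory errors" tip can be automated by trying checkpoints from largest to smallest. In this sketch, `load_model` is a hypothetical loader standing in for `whisper.load_model`, which typically raises a RuntimeError (e.g. CUDA out of memory) when a checkpoint does not fit; the demo uses a stub loader so the logic is runnable on its own.

```python
SIZES = ["large", "medium", "small", "base", "tiny"]  # largest first

def load_with_fallback(load_model, sizes=SIZES):
    """Try each size until one loads; re-raise if even the smallest fails."""
    last_error = None
    for size in sizes:
        try:
            return size, load_model(size)
        except RuntimeError as err:
            last_error = err  # remember the failure and try the next size down
    raise last_error

# Demo with a stub loader that "runs out of memory" above small:
def stub_loader(size):
    if size in ("large", "medium"):
        raise RuntimeError("out of memory")
    return f"model:{size}"

size, model = load_with_fallback(stub_loader)
print(size)  # small
```

In real use you would pass `whisper.load_model` (or your own wrapper) as the loader and keep the transcription call outside the retry loop.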

Key Takeaways

  • Select Whisper model size balancing accuracy and resource constraints.
  • Use larger models for noisy or complex audio, smaller for speed and low resource use.
  • The hosted OpenAI API exposes Whisper as a single model, whisper-1, with no size selection.
  • Local open-source Whisper allows explicit model size choice for offline use.
  • Test different sizes on your audio to find the best tradeoff.
Verified 2026-04 · whisper-1, whisper-tiny, whisper-base, whisper-small, whisper-medium, whisper-large