How-to · Beginner · 3 min read

How to run Whisper on CPU

Quick answer
Run Whisper on CPU with the openai-whisper Python package by loading the model with device="cpu". Inference then runs entirely on the CPU, so no GPU or CUDA installation is required.

PREREQUISITES

  • Python 3.8+
  • pip install torch (CPU-only wheels work; see Setup below)
  • pip install openai-whisper (note: the PyPI package named whisper is a different, unrelated project)
  • ffmpeg available on your PATH (Whisper uses it to decode audio)

Setup

Install PyTorch from the CPU-only wheel index so no CUDA libraries are pulled in, then install the openai-whisper package.

bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install openai-whisper
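Before loading a model, a quick sanity check confirms both packages are importable in your environment (the output naturally varies by install):

```python
import importlib.util

def installed(pkg: str) -> bool:
    # True if the package can be imported in this environment
    return importlib.util.find_spec(pkg) is not None

for pkg in ("torch", "whisper"):
    print(f"{pkg} installed: {installed(pkg)}")
```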

Step by step

Use the whisper Python package to load the model on CPU and transcribe an audio file. This example loads the base model on CPU and prints the transcription.

python
import whisper

# Load Whisper model on CPU explicitly
model = whisper.load_model("base", device="cpu")

# Transcribe audio file
result = model.transcribe("audio.mp3")

print("Transcription:", result["text"])
output
Transcription: Hello, this is a test audio transcription using Whisper on CPU.

Common variations

  • Use different Whisper model sizes like tiny, small, or medium for faster or more accurate transcription.
  • Offload transcription to a worker thread (e.g. asyncio.to_thread) when integrating into async apps, since model.transcribe() blocks.
  • Use OpenAI's Whisper API for cloud transcription instead of local CPU inference.
python
import whisper

# Load a smaller model for faster CPU transcription
model = whisper.load_model("tiny", device="cpu")

result = model.transcribe("audio.mp3")
print(result["text"])
output
Hello, this is a test audio transcription using Whisper on CPU.
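The asyncio variation can be sketched as follows. Here transcribe_blocking is a stand-in for the real Whisper call (shown in the comment) so the sketch runs without a model or audio file; the pattern of moving the blocking call onto a worker thread is the point:

```python
import asyncio

def transcribe_blocking(path: str) -> str:
    # In a real app this would be the blocking Whisper call:
    #   model = whisper.load_model("base", device="cpu")
    #   return model.transcribe(path)["text"]
    return f"transcribed {path}"

async def main() -> None:
    # asyncio.to_thread runs the blocking call in a worker thread,
    # keeping the event loop free to serve other tasks meanwhile
    text = await asyncio.to_thread(transcribe_blocking, "audio.mp3")
    print(text)

asyncio.run(main())
```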

Troubleshooting

  • If you get CUDA errors, ensure device="cpu" is set explicitly.
  • The warning "FP16 is not supported on CPU; using FP32 instead" is harmless; pass fp16=False to model.transcribe() to suppress it.
  • If transcription fails with a missing-ffmpeg error, install ffmpeg and make sure it is on your PATH.
  • For slow transcription, try smaller models like tiny or base.
  • Check that your audio file format is supported (mp3, wav, m4a, etc.).
  • Ensure torch is installed with CPU-only wheels if no GPU is present.
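As a lightweight guard against the unsupported-format issue above, you can pre-check the file extension before handing a path to Whisper. The SUPPORTED set here is an illustrative subset of what ffmpeg commonly decodes, not an official list:

```python
from pathlib import Path

# Illustrative subset of audio formats ffmpeg commonly decodes
SUPPORTED = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}

def looks_transcribable(path: str) -> bool:
    # Cheap extension check before spending time loading a model
    return Path(path).suffix.lower() in SUPPORTED

print(looks_transcribable("audio.mp3"))   # True
print(looks_transcribable("notes.txt"))   # False
```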

Key Takeaways

  • Install Whisper and CPU-only PyTorch to run Whisper locally without GPU.
  • Load Whisper models with device="cpu" to force CPU inference.
  • Use smaller Whisper models for faster CPU transcription at some accuracy cost.
Verified 2026-04 · whisper-base, whisper-tiny