How to run Whisper locally in Python
Direct answer
Run Whisper locally in Python by installing the openai-whisper package and using whisper.load_model to load a model, then call model.transcribe on your audio file.
Setup
Install
pip install openai-whisper
Whisper also needs the ffmpeg binary on your system PATH to decode audio.
Imports
import whisper
Examples
in: audio.mp3
out: Transcribed text of the audio content.
in: interview.wav
out: Full transcript of the interview audio.
in: short_clip.m4a
out: Short transcription snippet from the clip.
Integration steps
- Install the openai-whisper package via pip.
- Import the whisper module in your Python script.
- Load a Whisper model locally using whisper.load_model("base").
- Call model.transcribe() with the path to your audio file.
- Extract the 'text' field from the transcription result.
- Use or save the transcribed text as needed.
Full code
import whisper
# Load the Whisper model locally
model = whisper.load_model("base")
# Transcribe an audio file
result = model.transcribe("audio.mp3")
# Print the transcribed text
print("Transcription:", result["text"])
output
Transcription: This is a sample transcription of the audio file.
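Beyond result["text"], the transcription result also carries per-segment timestamps. A minimal sketch of formatting them, using a hand-built result dict with the same shape (text, segments, language) that model.transcribe() returns; the sample values are illustrative only:

```python
# Hand-built stand-in for a model.transcribe() result: a dict with
# "text", "language", and a "segments" list of start/end/text entries.
result = {
    "text": " Hello world. This is a test.",
    "language": "en",
    "segments": [
        {"start": 0.0, "end": 2.5, "text": " Hello world."},
        {"start": 2.5, "end": 5.0, "text": " This is a test."},
    ],
}

def format_segments(result: dict) -> list[str]:
    # Render each segment as "[start -> end] text" with zero-padded seconds.
    return [
        f"[{seg['start']:06.2f} -> {seg['end']:06.2f}]{seg['text']}"
        for seg in result["segments"]
    ]

for line in format_segments(result):
    print(line)  # e.g. [000.00 -> 002.50] Hello world.
```

Useful for generating subtitles or jumping to a spot in the recording.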
API trace
Request
N/A — local model inference with audio file path input
Response
{"text": "transcribed text", "segments": [...], "language": "en"}
Extract
result["text"]
Variants
Use faster-whisper for improved speed ›
Use when you want faster local transcription with similar accuracy.
from faster_whisper import WhisperModel
model = WhisperModel("base", device="cpu")
segments, _ = model.transcribe("audio.mp3")
text = "".join([segment.text for segment in segments])
print("Transcription:", text)
Run Whisper asynchronously ›
Use in async Python apps to avoid blocking during transcription.
import whisper
import asyncio
async def transcribe_async():
    model = whisper.load_model("base")
    # Run the blocking transcription in a worker thread
    result = await asyncio.to_thread(model.transcribe, "audio.mp3")
    print("Transcription:", result["text"])

asyncio.run(transcribe_async())
Performance
Latency: ~5-15 seconds per minute of audio on CPU for the base model
Cost: Free for local use; no API calls or charges
Rate limits: None; fully local
- Use smaller Whisper models (tiny, base) for faster transcription.
- Preprocess audio (trim silence, downsample) to shorten input and speed up transcription.
- Batch multiple files if using async to maximize throughput.
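The batching tip above can be sketched with asyncio.gather. Here transcribe_file is a hypothetical stand-in for a loaded model's transcribe method; swap in model.transcribe for real use:

```python
import asyncio

def transcribe_file(path: str) -> dict:
    # Hypothetical stand-in for model.transcribe(path), which is a
    # blocking call; hence the asyncio.to_thread wrapper below.
    return {"text": f"transcript of {path}"}

async def transcribe_many(paths: list[str]) -> list[dict]:
    # Schedule each blocking transcription on a worker thread,
    # then gather all results concurrently.
    tasks = [asyncio.to_thread(transcribe_file, p) for p in paths]
    return await asyncio.gather(*tasks)

results = asyncio.run(transcribe_many(["a.mp3", "b.wav", "c.m4a"]))
for r in results:
    print(r["text"])
```

Threads overlap file I/O and any inference work that releases the GIL; for strictly CPU-bound batches, separate worker processes may scale better.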
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| openai-whisper local base model | ~5-15s per minute of audio | Free | Accurate offline transcription |
| faster-whisper local | ~3-7s per minute of audio | Free | Faster local transcription |
| OpenAI Whisper API | ~1-3s per minute of audio | Paid API | Cloud transcription with no local setup |
Quick tip
Choose the Whisper model size based on your accuracy and speed needs; smaller models run faster but with less accuracy.
Common mistake
Calling model.transcribe() with an invalid file path fails inside ffmpeg with a cryptic error; verify the file exists and the model is loaded before transcribing.