How to run Whisper locally in Python
Direct answer
To run Whisper locally in Python, install the openai-whisper package, load a model with whisper.load_model, and call model.transcribe on your audio file.
Setup
Install
pip install openai-whisper
Imports
import whisper
Examples
in: audio.mp3
out: Transcribed text of the audio content.
in: interview.wav
out: Full transcript of the interview audio.
in: short_clip.m4a
out: Short transcription snippet from the clip.
Integration steps
- Install the openai-whisper package via pip; it also needs the ffmpeg command-line tool installed on your system to decode audio.
- Import the whisper module in your Python script.
- Load a Whisper model locally using whisper.load_model("base").
- Call model.transcribe() with the path to your audio file.
- Extract the 'text' field from the transcription result.
- Use or save the transcribed text as needed.
Full code
import whisper
# Load the Whisper model locally
model = whisper.load_model("base")
# Transcribe an audio file
result = model.transcribe("audio.mp3")
# Print the transcribed text
print("Transcription:", result["text"])
Output
Transcription: This is a sample transcription of the audio file.
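The result returned by model.transcribe is a dictionary, so the transcript can be saved or broken down by segment with plain Python. The following is a minimal sketch, assuming each entry in result["segments"] carries "start", "end", and "text" keys. The helper names format_segments and save_transcript are hypothetical, and sample_result is a hand-written stand-in for a real result so the block runs without a model.

```python
from pathlib import Path


def format_segments(result):
    """Return one '[start - end] text' line per segment of a Whisper-style result."""
    lines = []
    for seg in result.get("segments", []):
        lines.append(f"[{seg['start']:.2f}s - {seg['end']:.2f}s] {seg['text'].strip()}")
    return lines


def save_transcript(result, out_path):
    """Write the full transcript text to a UTF-8 file and return the path."""
    path = Path(out_path)
    path.write_text(result["text"].strip() + "\n", encoding="utf-8")
    return path


# Hand-written stand-in for the dictionary returned by model.transcribe("audio.mp3").
sample_result = {
    "text": " Hello there. This is a test.",
    "segments": [
        {"start": 0.0, "end": 1.5, "text": " Hello there."},
        {"start": 1.5, "end": 3.2, "text": " This is a test."},
    ],
    "language": "en",
}

for line in format_segments(sample_result):
    print(line)
```

With a real model, pass the dictionary from model.transcribe in place of sample_result, for example save_transcript(result, "audio.txt").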
API trace
Request
N/A: local model inference with audio file path input
Response
{"text": "transcribed text", "segments": [...], "language": "en"}
Extract
result["text"]
Variants
Use faster-whisper for improved speed
Use when you want faster local transcription with similar accuracy.
from faster_whisper import WhisperModel
model = WhisperModel("base", device="cpu")
segments, _ = model.transcribe("audio.mp3")
text = "".join([segment.text for segment in segments])
print("Transcription:", text)
Run Whisper asynchronously
Use in async Python apps to avoid blocking during transcription.
import whisper
import asyncio
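async def transcribe_async():
    # Load the model (this call itself is blocking)
    model = whisper.load_model("base")
    # Run the blocking transcribe call in a worker thread so the event loop stays free
    result = await asyncio.to_thread(model.transcribe, "audio.mp3")
    print("Transcription:", result["text"])
asyncio.run(transcribe_async())
Performance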
Latency: ~5-15 seconds per minute of audio on CPU for the base model
Cost: Free for local use; no API calls or charges
Rate limits: None, fully local
- Use smaller Whisper models (tiny, base) for faster transcription.
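- Preprocess audio to shorten it (for example, trim silence or unneeded sections); Whisper resamples input to 16 kHz internally, so lowering the sample rate mainly saves decoding time.
- If you transcribe many files from an async app, batch them and reuse one loaded model to maximize throughput.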
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| openai-whisper local base model | ~5-15s per minute audio | Free | Accurate offline transcription |
| faster-whisper local | ~3-7s per minute audio | Free | Faster local transcription |
| OpenAI Whisper API | ~1-3s per minute audio | Paid API | Cloud transcription with no local setup |
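The batching tip above can be sketched with asyncio. This is an illustrative outline, not a tuned implementation: transcribe_many and fake_transcribe are hypothetical names, and fake_transcribe stands in for model.transcribe so the block runs without a model. The source does not say whether one model instance tolerates concurrent transcribe calls, so the sketch assumes it may not and defaults to one transcription at a time.

```python
import asyncio


async def transcribe_many(transcribe, paths, max_concurrent=1):
    """Run a blocking transcribe(path) callable over many files off the event loop.

    Results come back in the same order as `paths`.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(path):
        async with semaphore:
            # asyncio.to_thread (Python 3.9+) keeps the event loop responsive.
            return await asyncio.to_thread(transcribe, path)

    return await asyncio.gather(*(worker(p) for p in paths))


def fake_transcribe(path):
    """Stand-in for model.transcribe so the sketch runs without a model."""
    return {"text": f"transcript of {path}"}


results = asyncio.run(transcribe_many(fake_transcribe, ["a.mp3", "b.wav"]))
print([r["text"] for r in results])
```

With a real model, load it once and pass model.transcribe as the callable. Raising max_concurrent only makes sense if you have confirmed that your setup handles parallel calls.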
Quick tip
Choose the Whisper model size based on your accuracy and speed needs; smaller models run faster but with less accuracy.
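One way to ground that choice is to time each candidate model on a representative file. The sketch below is a rough illustration; time_call and compare_models are hypothetical helpers, and load_model is passed in as a parameter so the timing logic does not depend on any particular library.

```python
import time


def time_call(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def compare_models(load_model, audio_path, names=("tiny", "base")):
    """Time transcription of one file with each model name; returns {name: seconds}."""
    timings = {}
    for name in names:
        model = load_model(name)  # loading time is deliberately not measured
        _, seconds = time_call(model.transcribe, audio_path)
        timings[name] = seconds
    return timings
```

For example, compare_models(whisper.load_model, "audio.mp3") returns the elapsed seconds per model name. Compare the transcripts by eye as well, since the timings say nothing about accuracy.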
Common mistake
Calling transcribe before loading a model, or passing a file path that does not exist, raises an error.
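A small guard can catch a bad path before any model work starts. This is a minimal sketch; require_audio_file is a hypothetical helper, and it only checks that the file exists, not that it contains decodable audio.

```python
from pathlib import Path


def require_audio_file(path):
    """Return the path as a string if it points to an existing file, else raise."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"Audio file not found: {p}")
    return str(p)
```

It can wrap the path argument directly, as in model.transcribe(require_audio_file("audio.mp3")).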