Code beginner · 3 min read

How to run Whisper locally in Python

Direct answer
Run Whisper locally in Python by installing the openai-whisper package, loading a model with whisper.load_model, and calling model.transcribe on your audio file.

Setup

Install
bash
pip install openai-whisper
Whisper also needs the ffmpeg binary on your PATH to decode audio (e.g. apt install ffmpeg or brew install ffmpeg).
Imports
python
import whisper

Examples

in: audio.mp3
out: Transcribed text of the audio content.
in: interview.wav
out: Full transcript of the interview audio.
in: short_clip.m4a
out: Short transcription snippet from the clip.

Integration steps

  1. Install the openai-whisper package via pip.
  2. Import the whisper module in your Python script.
  3. Load a Whisper model locally using whisper.load_model("base").
  4. Call model.transcribe() with the path to your audio file.
  5. Extract the 'text' field from the transcription result.
  6. Use or save the transcribed text as needed.

Full code

python
import whisper

# Load the Whisper model locally
model = whisper.load_model("base")

# Transcribe an audio file
result = model.transcribe("audio.mp3")

# Print the transcribed text
print("Transcription:", result["text"])
output
Transcription: This is a sample transcription of the audio file.

API trace

Request
N/A — local model inference; the input is a path to an audio file
Response
json
{"text": "transcribed text", "segments": [...], "language": "en"}
Extract: result["text"]
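The response shape above can be unpacked without calling the model. Here is a minimal sketch over a hand-written sample result — the dict mirrors what model.transcribe returns, but the values are illustrative, not real model output:

```python
# Sample result in the shape returned by model.transcribe()
# (values are made up for illustration)
result = {
    "text": " Hello world. This is a test.",
    "segments": [
        {"id": 0, "start": 0.0, "end": 1.5, "text": " Hello world."},
        {"id": 1, "start": 1.5, "end": 3.2, "text": " This is a test."},
    ],
    "language": "en",
}

# Full transcript: strip the leading space Whisper tends to include
transcript = result["text"].strip()

# Per-segment timestamps, e.g. for building subtitles
for seg in result["segments"]:
    print(f"[{seg['start']:.1f}-{seg['end']:.1f}] {seg['text'].strip()}")

print("Language:", result["language"])
```

The segments list is what you would iterate over to produce SRT/VTT subtitle files, since each entry carries start and end times in seconds.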

Variants

Use faster-whisper for improved speed

Use when you want faster local transcription with similar accuracy.

python
from faster_whisper import WhisperModel

model = WhisperModel("base", device="cpu")
segments, _ = model.transcribe("audio.mp3")
text = "".join([segment.text for segment in segments])
print("Transcription:", text)
Run Whisper asynchronously

Use in async Python apps to avoid blocking during transcription.

python
import whisper
import asyncio

async def transcribe_async():
    model = whisper.load_model("base")
    result = await asyncio.to_thread(model.transcribe, "audio.mp3")
    print("Transcription:", result["text"])

asyncio.run(transcribe_async())

Performance

Latency: ~5-15 seconds per minute of audio on CPU (base model)
Cost: free for local use; no API calls or charges
Rate limits: none, fully local
  • Use smaller Whisper models (tiny, base) for faster transcription.
  • Preprocess audio to reduce length or sample rate to speed up.
  • Batch multiple files if using async to maximize throughput.
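The preprocessing tip above can be sketched with ffmpeg: Whisper resamples audio to 16 kHz mono internally, so converting files ahead of time mainly saves decode work and disk I/O. The preprocess_cmd helper below is our own name, not part of whisper:

```python
import subprocess

def preprocess_cmd(src, dst):
    """Build an ffmpeg command that converts audio to 16 kHz mono WAV,
    the format Whisper uses internally. (Helper name is ours.)"""
    return [
        "ffmpeg", "-y", "-i", src,
        "-ar", "16000",  # 16 kHz sample rate
        "-ac", "1",      # mono
        dst,
    ]

cmd = preprocess_cmd("interview.wav", "interview_16k.wav")
# subprocess.run(cmd, check=True)  # uncomment to actually run ffmpeg
print(" ".join(cmd))
```

Run the resulting command once per file before transcription; the smaller WAV files also make batching many files cheaper.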
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| openai-whisper (local, base model) | ~5-15 s per minute of audio | Free | Accurate offline transcription |
| faster-whisper (local) | ~3-7 s per minute of audio | Free | Faster local transcription |
| OpenAI Whisper API | ~1-3 s per minute of audio | Paid API | Cloud transcription with no local setup |

Quick tip

Choose the Whisper model size based on your accuracy and speed needs; smaller models run faster but with less accuracy.
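As a rough guide to that trade-off, here is a sketch of picking a size by parameter budget. The parameter counts come from the openai-whisper README; the pick_model helper is our own convenience, not part of whisper:

```python
# Whisper model sizes in millions of parameters
# (per the openai-whisper README; larger = slower but more accurate)
MODEL_SIZES = {
    "tiny": 39, "base": 74, "small": 244, "medium": 769, "large": 1550,
}

def pick_model(max_params_m):
    """Return the largest model that fits under a parameter budget
    (in millions); falls back to tiny. Helper name is ours."""
    candidates = [m for m, p in MODEL_SIZES.items() if p <= max_params_m]
    return max(candidates, key=MODEL_SIZES.get) if candidates else "tiny"

print(pick_model(100))   # -> base
print(pick_model(2000))  # -> large
```

Pass the chosen name straight to whisper.load_model; there are also .en variants of the smaller models that are slightly better on English-only audio.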

Common mistake

Calling model.transcribe() with a file path that doesn't exist, or before a model has been loaded, raises an error; transcription also fails if the ffmpeg binary Whisper uses to decode audio is not installed.
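A small guard fails fast with a clear message instead of the cryptic ffmpeg decode error you otherwise get for a bad path. The safe_transcribe wrapper is our own sketch, assuming model is an already-loaded Whisper model:

```python
import os

def safe_transcribe(model, path):
    """Check the audio file exists before handing it to Whisper.
    (Wrapper name is ours; model is a loaded whisper model.)"""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Audio file not found: {path}")
    return model.transcribe(path)

# Example: a missing file now raises a readable error
try:
    safe_transcribe(None, "missing.mp3")
except FileNotFoundError as e:
    print(e)
```

Because the path check runs before model.transcribe is touched, the wrapper surfaces the real problem even when the model argument is wrong too.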

Verified 2026-04 · whisper-base, faster-whisper-base