How to run Whisper locally in Python
Direct answer
Run Whisper locally in Python by installing the openai-whisper package and using whisper.load_model to load a model, then call model.transcribe on your audio file.
Setup
Install
pip install openai-whisper
Whisper also needs the ffmpeg binary on your system PATH to decode audio.
Imports
import whisper
Examples
in: audio.mp3
out: Transcribed text of the audio content.
in: interview.wav
out: Full transcript of the interview audio.
in: short_clip.m4a
out: Short transcription snippet from the clip.
Integration steps
- Install the openai-whisper package via pip.
- Import the whisper module in your Python script.
- Load a Whisper model locally using whisper.load_model("base").
- Call model.transcribe() with the path to your audio file.
- Extract the 'text' field from the transcription result.
- Use or save the transcribed text as needed.
Full code
import whisper
# Load the Whisper model locally
model = whisper.load_model("base")
# Transcribe an audio file
result = model.transcribe("audio.mp3")
# Print the transcribed text
print("Transcription:", result["text"])
output
Transcription: This is a sample transcription of the audio file.
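Beyond result["text"], the transcription result also carries per-segment timestamps. A minimal sketch of formatting them, using a hand-built result dict with the same shape (text, segments, language) that model.transcribe() returns; the sample values are illustrative only:

```python
# Hand-built stand-in for a model.transcribe() result: a dict with
# "text", "language", and a "segments" list of start/end/text entries.
result = {
    "text": " Hello world. This is a test.",
    "language": "en",
    "segments": [
        {"start": 0.0, "end": 2.5, "text": " Hello world."},
        {"start": 2.5, "end": 5.0, "text": " This is a test."},
    ],
}

def format_segments(result: dict) -> list[str]:
    # Render each segment as "[start -> end] text" with zero-padded seconds.
    return [
        f"[{seg['start']:06.2f} -> {seg['end']:06.2f}]{seg['text']}"
        for seg in result["segments"]
    ]

for line in format_segments(result):
    print(line)  # e.g. [000.00 -> 002.50] Hello world.
```

Useful for generating subtitles or jumping to a spot in the recording.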
API trace
Request
N/A — local model inference with audio file path input
Response
{"text": "transcribed text", "segments": [...], "language": "en"}
Extract
result["text"]
Variants
Use faster-whisper for improved speed ›
Use when you want faster local transcription with similar accuracy.
from faster_whisper import WhisperModel
model = WhisperModel("base", device="cpu")
segments, _ = model.transcribe("audio.mp3")
text = "".join([segment.text for segment in segments])
print("Transcription:", text)
Run Whisper asynchronously ›
Use in async Python apps to avoid blocking during transcription.
import whisper
import asyncio
async def transcribe_async():
    model = whisper.load_model("base")
    # Run the blocking transcription in a worker thread
    result = await asyncio.to_thread(model.transcribe, "audio.mp3")
    print("Transcription:", result["text"])

asyncio.run(transcribe_async())
Performance
Latency: ~5-15 seconds per minute of audio on CPU for the base model
Cost: Free for local use; no API calls or charges
Rate limits: None; fully local
- Use smaller Whisper models (tiny, base) for faster transcription.
- Preprocess audio (trim silence, downsample) to shorten input and speed up transcription.
- Batch multiple files if using async to maximize throughput.
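The batching tip above can be sketched with asyncio.gather. Here transcribe_file is a hypothetical stand-in for a loaded model's transcribe method; swap in model.transcribe for real use:

```python
import asyncio

def transcribe_file(path: str) -> dict:
    # Hypothetical stand-in for model.transcribe(path), which is a
    # blocking call; hence the asyncio.to_thread wrapper below.
    return {"text": f"transcript of {path}"}

async def transcribe_many(paths: list[str]) -> list[dict]:
    # Schedule each blocking transcription on a worker thread,
    # then gather all results concurrently.
    tasks = [asyncio.to_thread(transcribe_file, p) for p in paths]
    return await asyncio.gather(*tasks)

results = asyncio.run(transcribe_many(["a.mp3", "b.wav", "c.m4a"]))
for r in results:
    print(r["text"])
```

Threads overlap file I/O and any inference work that releases the GIL; for strictly CPU-bound batches, separate worker processes may scale better.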
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| openai-whisper local base model | ~5-15s per minute of audio | Free | Accurate offline transcription |
| faster-whisper local | ~3-7s per minute of audio | Free | Faster local transcription |
| OpenAI Whisper API | ~1-3s per minute of audio | Paid API | Cloud transcription with no local setup |
Quick tip
Choose the Whisper model size based on your accuracy and speed needs; smaller models run faster but with less accuracy.
Common mistake
Calling model.transcribe() with an invalid file path fails inside ffmpeg with a cryptic error; verify the file exists and the model is loaded before transcribing.