Code beginner · 3 min read

How to use Whisper API in Python

Direct answer
Use the OpenAI Python SDK to call client.audio.transcriptions.create with model="whisper-1" and an audio file to transcribe audio using the Whisper API.

Setup

Install
bash
pip install openai
Env vars
OPENAI_API_KEY
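Set the key in your shell before running any of the examples below (the key value shown is a placeholder, not a real key):

```shell
export OPENAI_API_KEY="sk-your-key-here"
```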
Imports
python
from openai import OpenAI
import os

Examples

In: Transcribe a short mp3 audio file 'speech.mp3'
Out: Transcription text: "Hello, this is a test of the Whisper API."
In: Transcribe a wav file 'meeting.wav' with clear speech
Out: Transcription text: "Today’s meeting covered project milestones and deadlines."
In: Transcribe a noisy audio file 'noisy_audio.mp3'
Out: Transcription text: "Despite background noise, the main points were captured accurately."

Integration steps

  1. Install the OpenAI Python SDK and set the OPENAI_API_KEY environment variable.
  2. Import the OpenAI client and initialize it with the API key from environment variables.
  3. Open the audio file in binary mode for reading.
  4. Call client.audio.transcriptions.create with model="whisper-1" and the audio file object.
  5. Extract the transcription text from the response's text field.
  6. Print or use the transcribed text as needed.

Full code

python
from openai import OpenAI
import os

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Replace 'audio.mp3' with your audio file path
with open("audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file
    )

print("Transcription text:", transcript.text)
output
Transcription text: Hello, this is a test of the Whisper API.

API trace

Request
json
{"model": "whisper-1", "file": <binary audio file>}
Response
json
{"text": "Transcribed text string", "language": "en", "duration": 12.3}
Extract: response.text
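The language and duration fields in the response trace above are only returned when you ask for a richer output format. A minimal sketch of requesting verbose_json (the helper name transcribe_verbose is illustrative, not part of the SDK):

```python
import os

def transcribe_verbose(path: str):
    """Return a verbose transcription: text plus language, duration, and segments."""
    from openai import OpenAI  # imported here so the helper is only bound to the SDK when called
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    with open(path, "rb") as audio_file:
        return client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            response_format="verbose_json",  # other options: "json", "text", "srt", "vtt"
        )

# Usage (requires OPENAI_API_KEY and an audio file on disk):
# t = transcribe_verbose("audio.mp3")
# print(t.language, t.duration)
# print(t.text)
```

The "srt" and "vtt" formats return subtitle files with timestamps, which is handy when the transcript needs to be displayed alongside video.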

Variants

Async version

Use when integrating Whisper transcription in asynchronous Python applications for concurrency.

python
import asyncio
from openai import AsyncOpenAI
import os

async def transcribe_async():
    # AsyncOpenAI is the async client; the method name is the same as the sync one
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    with open("audio.mp3", "rb") as audio_file:
        transcript = await client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file
        )
    print("Transcription text:", transcript.text)

asyncio.run(transcribe_async())
Local Whisper transcription (offline)

Use when you want to transcribe audio locally without API calls or internet dependency.

python
import whisper  # pip install openai-whisper; requires ffmpeg on your PATH

model = whisper.load_model("base")
result = model.transcribe("audio.mp3")
print("Transcription text:", result["text"])
Specify language parameter

Use when you know the audio language in advance to improve transcription accuracy.

python
from openai import OpenAI
import os

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

with open("audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        language="en"
    )

print("Transcription text:", transcript.text)

Performance

Latency: ~3-10 seconds per minute of audio, depending on file size and network speed
Cost: ~$0.006 per minute of audio for Whisper API transcription
Rate limits: default tier 60 requests per minute, 1000 minutes per month (check OpenAI docs for updates)
  • Trim audio to only the needed segments to reduce cost and latency.
  • Use the language parameter if known to improve transcription speed and accuracy.
  • Avoid re-uploading the same audio multiple times; cache transcripts when possible.
| Approach | Latency | Cost/call | Best for |
| --- | --- | --- | --- |
| Standard Whisper API call | ~3-10s per audio minute | ~$0.006/min | Reliable cloud transcription with minimal setup |
| Async Whisper API call | ~3-10s per audio minute | ~$0.006/min | Concurrent transcription in async apps |
| Local Whisper model | Varies by hardware (seconds to minutes) | Free (local compute cost) | Offline transcription without API dependency |

Quick tip

Always open your audio file in binary mode ('rb') before passing it to client.audio.transcriptions.create to avoid file read errors.

Common mistake

Opening the audio file in text mode, or passing a file path string instead of an open file object, causes request errors: the SDK expects a binary file object.

Verified 2026-04 · whisper-1