How to use Whisper API in Python
Direct answer
Use the OpenAI Python SDK to call client.audio.transcriptions.create with model="whisper-1" and an open audio file to transcribe audio with the Whisper API.

Setup
Install
pip install openai

Env vars
OPENAI_API_KEY

Imports
from openai import OpenAI
import os

Examples
In: Transcribe a short mp3 audio file 'speech.mp3'
Out: Transcription text: "Hello, this is a test of the Whisper API."
In: Transcribe a wav file 'meeting.wav' with clear speech
Out: Transcription text: "Today’s meeting covered project milestones and deadlines."
In: Transcribe a noisy audio file 'noisy_audio.mp3'
Out: Transcription text: "Despite background noise, the main points were captured accurately."
Integration steps
- Install the OpenAI Python SDK and set the OPENAI_API_KEY environment variable.
- Import the OpenAI client and initialize it with the API key from environment variables.
- Open the audio file in binary mode for reading.
- Call client.audio.transcriptions.create with model="whisper-1" and the audio file object.
- Extract the transcription text from the response's text field.
- Print or use the transcribed text as needed.
Full code
from openai import OpenAI
import os
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Replace 'audio.mp3' with your audio file path
with open("audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file
    )
print("Transcription text:", transcript.text)

Output
Transcription text: Hello, this is a test of the Whisper API.
API trace
Request
{"model": "whisper-1", "file": <binary audio file>}

Response
{"text": "Transcribed text string", "language": "en", "duration": 12.3}
(the language and duration fields are returned only when response_format="verbose_json" is requested; the default response contains just text)

Extract
response.text

Variants
Async version ›
Use when integrating Whisper transcription in asynchronous Python applications for concurrency.
import asyncio
import os
from openai import AsyncOpenAI

async def transcribe_async():
    # The async client makes transcriptions.create awaitable
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    with open("audio.mp3", "rb") as audio_file:
        transcript = await client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file
        )
    print("Transcription text:", transcript.text)

asyncio.run(transcribe_async())

Local Whisper transcription (offline) ›
Use when you want to transcribe audio locally without API calls or internet dependency (requires the open-source package: pip install openai-whisper).
import whisper
model = whisper.load_model("base")
result = model.transcribe("audio.mp3")
print("Transcription text:", result["text"])

Specify language parameter ›
Use when you know the audio language in advance to improve transcription accuracy.
from openai import OpenAI
import os
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
with open("audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        language="en"
    )
print("Transcription text:", transcript.text)

Performance
Latency: ~3-10 seconds per minute of audio, depending on file size and network speed
Cost: ~$0.006 per minute of audio for Whisper API transcription
Rate limits: vary by account tier (e.g. ~60 requests per minute on the default tier); check the OpenAI docs for current values
- Trim audio to only the needed segments to reduce cost and latency.
- Use the language parameter if known to improve transcription speed and accuracy.
- Avoid re-uploading the same audio multiple times; cache transcripts when possible.
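The caching tip above can be sketched as a small content-hash cache. This is a minimal sketch, not an official pattern: cached_transcribe, file_hash, and the transcript_cache.json file are all illustrative names, and transcribe stands for any callable (for example a wrapper around the Whisper API call) that maps a file path to transcript text.

```python
import hashlib
import json
import os

CACHE_PATH = "transcript_cache.json"  # illustrative cache location

def file_hash(path):
    # Hash the audio bytes so renamed copies of the same file still hit the cache
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def cached_transcribe(path, transcribe):
    # `transcribe` is any path -> text callable, e.g. a Whisper API wrapper
    cache = {}
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            cache = json.load(f)
    key = file_hash(path)
    if key not in cache:
        cache[key] = transcribe(path)  # only upload on a cache miss
        with open(CACHE_PATH, "w") as f:
            json.dump(cache, f)
    return cache[key]
```

With this wrapper, repeated requests for the same audio return the stored transcript instead of re-uploading the file.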
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Standard Whisper API call | ~3-10s per audio minute | ~$0.006/min | Reliable cloud transcription with minimal setup |
| Async Whisper API call | ~3-10s per audio minute | ~$0.006/min | Concurrent transcription in async apps |
| Local Whisper model | Varies by hardware (seconds to minutes) | Free (local compute cost) | Offline transcription without API dependency |
Quick tip
Always open your audio file in binary mode ('rb') before passing it to client.audio.transcriptions.create to avoid file read errors.
Common mistake
Forgetting to open the audio file in binary mode or passing a file path string instead of a file object causes API errors.
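The mistake is easy to reproduce without calling the API at all: audio bytes are not valid UTF-8 text, so a text-mode read fails where a binary-mode read succeeds. A minimal demonstration (the file name and bytes are illustrative):

```python
# Write a few bytes resembling an MP3 frame header (illustrative only)
with open("clip.mp3", "wb") as f:
    f.write(bytes([0xFF, 0xFB, 0x90, 0x00]))

# Binary mode: returns raw bytes, which is what the SDK uploads
with open("clip.mp3", "rb") as f:
    data = f.read()

# Text mode: fails, because 0xFF is never a valid UTF-8 byte
try:
    with open("clip.mp3", "r", encoding="utf-8") as f:
        f.read()
    text_mode_ok = True
except UnicodeDecodeError:
    text_mode_ok = False
```

The same principle explains the second half of the mistake: passing a path string instead of an open binary file object means the SDK never receives the raw bytes it needs to upload.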