Code beginner · 3 min read

How to use the Whisper API in Python

Q: How to use the Whisper API in Python

Use the OpenAI Python SDK's client.audio.transcriptions.create method with your audio file and model set to whisper-1 to transcribe audio in Python.

Setup

Install

bash

pip install openai

Env vars

OPENAI_API_KEY
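
A missing key otherwise surfaces as a bare KeyError or an authentication error from the API. The following is a minimal sketch of an early check; the helper name get_api_key is hypothetical and not part of the OpenAI SDK.

```python
import os


def get_api_key(env=None):
    """Return the OpenAI API key, or raise a clear error if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to test. The function name is illustrative, not an SDK API.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the script."
        )
    return key
```

The returned value can be passed to OpenAI(api_key=...) in place of reading os.environ directly.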

Imports

python

import os
from openai import OpenAI

Examples

Input: Audio file 'speech.mp3' (English speech)

Output: Transcription text "Hello, this is a test of the Whisper API."

Input: Audio file 'interview.wav' (Interview in English)

Output: Transcription text "Today we discuss the future of AI and technology."

Input: Audio file 'spanish_audio.mp3' (Spanish speech)

Output: Transcription text "Hola, esta es una prueba de la API Whisper."

Integration steps

Install the OpenAI Python SDK and set your API key in the environment variable OPENAI_API_KEY.
Import the OpenAI client and initialize it with your API key from os.environ.
Open your audio file in binary mode.
Call client.audio.transcriptions.create with the file, model='whisper-1', and optionally specify language.
Extract the transcription text from the response's 'text' field.
Use or display the transcription as needed.

Full code

python

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Path to your audio file
audio_file_path = "speech.mp3"

with open(audio_file_path, "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        file=audio_file,
        model="whisper-1"
    )

print("Transcription:", transcription.text)

output

Transcription: Hello, this is a test of the Whisper API.

API trace

Request

multipart/form-data

model=whisper-1
file=<binary audio data>

Response

json

{"text": "Hello, this is a test of the Whisper API."}

Extract: transcription.text

Variants

Specify language for better accuracy

Use when you know the audio language to improve transcription accuracy.

python

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

with open("spanish_audio.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        file=audio_file,
        model="whisper-1",
        language="es"
    )

print("Transcription:", transcription.text)
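
When the language is only sometimes known, the two calls can share one helper. This is a minimal sketch: transcribe_file is a hypothetical name, and it assumes client is an OpenAI client created as in the examples above.

```python
def transcribe_file(client, path, language=None):
    """Transcribe one audio file, passing `language` only when it is known.

    `client` is assumed to be an openai.OpenAI instance; `language` is an
    ISO-639-1 code such as "es". The helper name is illustrative.
    """
    kwargs = {"model": "whisper-1"}
    if language:
        kwargs["language"] = language
    # Binary mode is required so the SDK uploads the raw audio bytes.
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(file=audio_file, **kwargs)
    return result.text
```

Calling transcribe_file(client, "spanish_audio.mp3", language="es") matches the example above, while omitting language leaves detection to the model.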

Use Whisper API for translation

Use when you want to translate audio speech to English instead of just transcribing.

python

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

with open("french_audio.mp3", "rb") as audio_file:
    translation = client.audio.translations.create(
        file=audio_file,
        model="whisper-1"
    )

print("Translation:", translation.text)

Async transcription call

Use the async client (AsyncOpenAI) for concurrent transcription calls in async applications. The method name is still create; it is awaited instead of called synchronously.

python

import os
import asyncio
from openai import AsyncOpenAI

# AsyncOpenAI exposes the same methods as OpenAI, but they are awaitable
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def transcribe():
    with open("speech.mp3", "rb") as audio_file:
        transcription = await client.audio.transcriptions.create(
            file=audio_file,
            model="whisper-1"
        )
    print("Transcription:", transcription.text)

asyncio.run(transcribe())
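
The benefit of the async client shows up when several files are transcribed at once. Below is a minimal sketch using asyncio.gather with a semaphore to cap concurrency; transcribe_many and max_concurrency are hypothetical names, and client is assumed to be an AsyncOpenAI instance.

```python
import asyncio


async def _transcribe_one(client, path, semaphore):
    # The semaphore limits how many requests are in flight at once.
    async with semaphore:
        with open(path, "rb") as audio_file:
            result = await client.audio.transcriptions.create(
                file=audio_file,
                model="whisper-1",
            )
    return result.text


async def transcribe_many(client, paths, max_concurrency=4):
    """Transcribe several files concurrently; results keep the input order.

    `client` is assumed to be an openai.AsyncOpenAI instance. The default
    concurrency cap is an arbitrary illustrative value, not a recommendation.
    """
    semaphore = asyncio.Semaphore(max_concurrency)
    tasks = [_transcribe_one(client, p, semaphore) for p in paths]
    return await asyncio.gather(*tasks)
```

A caller would run it with asyncio.run(transcribe_many(client, ["a.mp3", "b.mp3"])). Keeping the cap low is one way to stay under the account's rate limit.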

Performance

Latency: ~2-5 seconds per minute of audio, depending on file size and network

Cost: ~$0.006 per minute of audio processed with the Whisper API

Rate limits: Default tier is 60 requests per minute; check the OpenAI docs for current limits

Use compressed audio formats like mp3 or m4a to reduce upload size.
Trim silence or irrelevant parts before sending audio to reduce cost.
Specify language to avoid extra processing and improve speed.

Approach	Latency	Cost	Best for
Standard transcription	~2-5s per minute	~$0.006/min	General audio transcription
Translation endpoint	~3-6s per minute	~$0.006/min	Transcribing and translating non-English audio
Async transcription	Varies, concurrent calls	~$0.006/min	High throughput or async apps
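
The per-minute price quoted above makes rough budgeting a one-line calculation. This is a sketch only: estimate_cost_usd is a hypothetical helper, and the default rate is the approximate figure from this article, so check current pricing before relying on it.

```python
def estimate_cost_usd(duration_seconds, usd_per_minute=0.006):
    """Estimate the transcription cost for a clip of the given length.

    The default rate is the approximate figure quoted in this article;
    actual billing and rounding rules may differ.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return duration_seconds / 60 * usd_per_minute
```

At this rate a 10-minute recording comes to roughly $0.06, which is why trimming silence before upload reduces cost proportionally.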

Quick tip

When you know the audio language, pass it in the language parameter (an ISO-639-1 code such as "es") to improve Whisper transcription accuracy.

Common mistake

Beginners often forget to open the audio file in binary mode ('rb'), causing the API call to fail.
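
Other upload failures can be caught before the request is sent. The sketch below is a hypothetical pre-flight check: the 25 MB cap and the extension list are assumptions based on OpenAI's published limits at the time of writing and should be verified against the current docs.

```python
import os

# Assumed limits; verify against the current OpenAI documentation.
MAX_BYTES = 25 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}


def check_audio_file(path):
    """Return a list of problems found; an empty list means no issue was detected."""
    if not os.path.isfile(path):
        return [f"{path} does not exist or is not a regular file"]
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file is larger than the assumed 25 MB upload limit")
    return problems
```

Running this before open(path, "rb") turns a failed API call into a readable local message.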

Verified 2026-04 · whisper-1
