How to transcribe audio with OpenAI Whisper API
Quick answer
Call client.audio.transcriptions.create with your audio file and model="whisper-1". The call returns a transcription object whose .text attribute holds the transcribed text. Authenticate with your OpenAI API key and supply the audio in a supported format such as mp3 or wav.

Prerequisites
- Python 3.8+
- An OpenAI API key
- pip install openai>=1.0
Setup
Install the official OpenAI Python SDK and set your API key as an environment variable.
- Install the SDK: pip install openai
- Set your API key in your shell: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows)
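Before making any API calls, it can help to confirm the key is actually visible to Python in the current shell. A minimal check like the following (the helper name is illustrative, not part of the SDK) fails fast instead of surfacing an AuthenticationError later:

```python
import os

def check_api_key(env=os.environ) -> bool:
    """Return True if OPENAI_API_KEY is set to a non-empty value."""
    return bool(env.get("OPENAI_API_KEY"))

if not check_api_key():
    print("OPENAI_API_KEY is not set in this shell")
```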
Step by step
This example demonstrates how to transcribe an audio file using the OpenAI Whisper API with the whisper-1 model. Supported audio formats include mp3, wav, m4a, and more.
```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Path to your audio file
audio_file_path = "audio.mp3"

with open(audio_file_path, "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

print("Transcription:", transcript.text)
```

Output:
Transcription: Hello, this is a sample audio transcription using OpenAI Whisper API.
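The endpoint also accepts optional parameters beyond model and file; for example, response_format lets you request the transcript as a plain string instead of a JSON object. The sketch below assumes a configured client like the one above (the helper function name is illustrative, not part of the SDK):

```python
def transcribe_as_text(client, path: str) -> str:
    """Request a plain-string transcript via response_format="text".

    With response_format="text" the SDK returns the transcript directly
    as a str, so there is no .text attribute to read.
    """
    with open(path, "rb") as audio_file:
        return client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            response_format="text",
        )
```

Other documented values for response_format include srt and vtt, which are convenient when you need subtitle files rather than raw text.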
Common variations
- Async usage: use the AsyncOpenAI client and await the transcription call.
- Different audio formats: Whisper accepts mp3, mp4, mpeg, mpga, m4a, wav, and webm.
- Local transcription: use the openai-whisper Python package for offline transcription.
For example, the async variation:

```python
import asyncio
import os
from openai import AsyncOpenAI

async def transcribe_async():
    # AsyncOpenAI exposes the same endpoints as OpenAI, but the calls are awaitable.
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    with open("audio.mp3", "rb") as audio_file:
        transcript = await client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )
    print("Async transcription:", transcript.text)

asyncio.run(transcribe_async())
```

Output:
Async transcription: Hello, this is a sample audio transcription using OpenAI Whisper API.
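For the local-transcription variation, the open-source openai-whisper package runs the model on your own machine. The sketch below follows that package's documented interface (whisper.load_model and model.transcribe); it assumes pip install openai-whisper and ffmpeg on your PATH, and the helper name is illustrative:

```python
def transcribe_locally(path: str, size: str = "base") -> str:
    """Offline transcription with the open-source openai-whisper package.

    Model weights are downloaded on first use; larger sizes are slower
    but more accurate.
    """
    import whisper  # imported lazily; only needed for the offline path

    model = whisper.load_model(size)  # sizes: tiny, base, small, medium, large
    result = model.transcribe(path)
    return result["text"]
```

Because everything runs locally, no API key is required and the 25 MB upload limit does not apply.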
Troubleshooting
- If you get Invalid file format, ensure your audio file is in one of the supported formats and is not corrupted.
- If the transcription is inaccurate, try higher-quality audio or, if available, a different transcription model.
- For AuthenticationError, verify that your OPENAI_API_KEY environment variable is set correctly.
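The format and size failures above can be caught before uploading anything. A minimal pre-flight check (the extension list is this article's supported-format list, and 25 MB is OpenAI's documented upload cap; the helper name is illustrative):

```python
import os

SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}
MAX_BYTES = 25 * 1024 * 1024  # the API rejects uploads over 25 MB

def validate_audio(path: str) -> None:
    """Raise ValueError for files the API would reject anyway."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {ext or 'no extension'}")
    if os.path.getsize(path) > MAX_BYTES:
        raise ValueError("file exceeds the 25 MB API limit")
```

Note this only checks the extension and size; a corrupted file with a valid extension will still fail server-side.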
Key Takeaways
- Use client.audio.transcriptions.create with model="whisper-1" to transcribe audio files via the OpenAI API.
- Supported audio formats include mp3, wav, m4a, mp4, and webm; the API enforces a 25 MB file size limit.
- Set the OPENAI_API_KEY environment variable before running the code.
- Async transcription is supported via the AsyncOpenAI client for integration in async apps.
- For local, offline transcription, use the openai-whisper Python package instead of the API.