How to transcribe audio with OpenAI in Python
Quick answer
Use the OpenAI Python SDK's client.audio.transcriptions.create method with your audio file and a model (e.g., whisper-1) to transcribe audio to text. Provide the audio file as a binary stream and specify the model to get the transcription result.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0"
Setup
Install the OpenAI Python SDK and set your API key as an environment variable for secure authentication.
pip install "openai>=1.0"
Step by step
This example transcribes an audio file (e.g., WAV or MP3) using OpenAI's Whisper model whisper-1. The audio file is opened in binary mode and passed to the client.audio.transcriptions.create method.
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Path to your audio file
audio_file_path = "audio_sample.wav"
with open(audio_file_path, "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        file=audio_file,
        model="whisper-1"
    )
print("Transcription:", transcription.text)
Output
Transcription: Hello, this is a sample audio transcription using OpenAI Whisper.
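To reuse the steps above across several files, the call can be wrapped in a small function. This is a minimal sketch rather than anything the SDK provides: `transcribe_file` is a hypothetical helper name, and it takes the client as an argument so it can be exercised with a stub instead of a live API call.

```python
def transcribe_file(client, path, model="whisper-1"):
    """Return the transcript text for the audio file at `path`.

    `client` is assumed to be an `openai.OpenAI` instance, or any object
    exposing the same `audio.transcriptions.create` method.
    """
    # Binary mode matters: the raw bytes of the file are uploaded.
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(file=audio_file, model=model)
    return result.text
```

With the client created in the example above, this would be called as `transcribe_file(client, "audio_sample.wav")`.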
Common variations
- Use different audio formats supported by Whisper such as MP3, MP4, or FLAC.
- Specify the language parameter (an ISO-639-1 code such as "en") to improve accuracy when the spoken language is known.
- Use asynchronous calls or integrate with frameworks for streaming audio transcription.
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
with open("audio_sample.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        file=audio_file,
        model="whisper-1",
        language="en"
    )
print("Transcription:", transcription.text)
Output
Transcription: This is an English audio transcription example.
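The asynchronous variation listed above can be sketched with `asyncio`. The SDK ships an `AsyncOpenAI` client whose methods mirror the synchronous ones. `transcribe_one` and `transcribe_many` are hypothetical helper names, and the client is passed in so the sketch does not depend on a live API key.

```python
import asyncio


async def transcribe_one(client, path, model="whisper-1", language=None):
    """Transcribe one file with an async client and return the text."""
    # Only send `language` when the caller knows it (an ISO-639-1 code such as "en").
    extra = {"language": language} if language else {}
    with open(path, "rb") as audio_file:
        result = await client.audio.transcriptions.create(
            file=audio_file, model=model, **extra
        )
    return result.text


async def transcribe_many(client, paths, **options):
    """Transcribe several files concurrently; results keep the order of `paths`."""
    return await asyncio.gather(
        *(transcribe_one(client, p, **options) for p in paths)
    )
```

With the real SDK, this would be driven by something like `asyncio.run(transcribe_many(AsyncOpenAI(), ["a.mp3", "b.mp3"], language="en"))` after `from openai import AsyncOpenAI`. The file names here are placeholders.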
Troubleshooting
- If you get a FileNotFoundError, verify the audio file path is correct.
- If the transcription is empty or inaccurate, check the audio quality and format.
- Ensure your API key is set correctly in the environment variable OPENAI_API_KEY.
- For large files, consider chunking or using streaming transcription if supported.
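Several of the checks above can be run before any request is sent. The following is a rough sketch using only the standard library. `preflight` is a hypothetical helper, and both the extension list and the 25 MB limit are assumptions that should be verified against OpenAI's current documentation.

```python
import os

# Assumed values: check OpenAI's current documentation before relying on them.
SUPPORTED_EXTENSIONS = {
    ".flac", ".m4a", ".mp3", ".mp4", ".mpeg", ".mpga", ".ogg", ".wav", ".webm",
}
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def preflight(path, env=os.environ):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    if not os.path.isfile(path):
        problems.append(f"file not found: {path}")
        return problems  # the remaining checks need the file to exist
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unexpected extension: {ext or '(none)'}")
    size = os.path.getsize(path)
    if size == 0:
        problems.append("file is empty")
    elif size > MAX_UPLOAD_BYTES:
        problems.append("file exceeds the assumed upload limit; consider chunking")
    return problems
```

Calling `preflight("audio_sample.wav")` before the transcription request gives a readable list of issues instead of a traceback or a failed upload.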
Key Takeaways
- Use client.audio.transcriptions.create with whisper-1 to transcribe audio files.
- Always read audio files in binary mode and pass the file object to the API.
- Set the language parameter to improve transcription accuracy when known.
- Verify your environment variable OPENAI_API_KEY is correctly set before running code.
- Check audio format compatibility and file path correctness to avoid common errors.