How to · Beginner · 3 min read

How to transcribe multiple audio files

Quick answer
Use the OpenAI Whisper API by iterating over your audio files and calling client.audio.transcriptions.create for each file. Automate this with Python by opening each audio file in a loop and collecting the transcriptions.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (Whisper API usage is billed per minute of audio)
  • pip install "openai>=1.0" (quoted so the shell does not treat >= as a redirect)

Setup

Install the openai Python package and set your OpenAI API key as an environment variable.

  • Install package: pip install "openai>=1.0"
  • Set environment variable: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows; setx takes effect in new terminal sessions)
bash
pip install "openai>=1.0"
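Once the environment variable is set, it can help to fail fast if the key is missing rather than discovering the problem on the first API call. The helper below is a small sketch; require_api_key is an illustrative name, not part of the openai package:

```python
import os

def require_api_key(env=None):
    """Return the OpenAI API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; see the Setup steps above")
    return key
```

Call require_api_key() once at startup so a missing key surfaces immediately with a readable message.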

Step by step

This example demonstrates how to transcribe multiple audio files sequentially using the OpenAI Whisper API. It opens each file, sends it to the API, and prints the transcription.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def transcribe_files(file_paths):
    transcriptions = {}
    for path in file_paths:
        with open(path, "rb") as audio_file:
            transcript = client.audio.transcriptions.create(
                model="whisper-1",
                file=audio_file
            )
            transcriptions[path] = transcript.text
    return transcriptions

if __name__ == "__main__":
    audio_files = ["audio1.mp3", "audio2.wav", "audio3.m4a"]
    results = transcribe_files(audio_files)
    for file, text in results.items():
        print(f"Transcription for {file}:\n{text}\n")
output
Transcription for audio1.mp3:
Hello, this is the first audio file.

Transcription for audio2.wav:
This is the second audio file transcription.

Transcription for audio3.m4a:
Final audio file transcription text here.
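Rather than hard-coding the file list, you can collect every supported audio file from a directory. This is a minimal sketch: find_audio_files is an illustrative helper, and the extension list mirrors the formats the Whisper API documents as accepted uploads.

```python
from pathlib import Path

# Upload formats accepted by the Whisper API.
AUDIO_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}

def find_audio_files(directory):
    """Return sorted paths of supported audio files in a directory."""
    return sorted(
        str(p) for p in Path(directory).iterdir()
        if p.suffix.lower() in AUDIO_EXTENSIONS
    )
```

The result can be passed straight to the loop above, e.g. transcribe_files(find_audio_files("recordings")).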

Common variations

You can adapt the transcription process by:

  • Using asynchronous calls with asyncio to run requests concurrently.
  • Streaming partial transcriptions if supported by the API.
  • Specifying different Whisper models if available.
  • Handling different audio formats (mp3, wav, m4a, etc.) supported by Whisper.
python
import os
import asyncio
from openai import AsyncOpenAI

# AsyncOpenAI exposes awaitable versions of the same endpoints.
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def transcribe_file_async(path):
    with open(path, "rb") as audio_file:
        transcript = await client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file
        )
        return path, transcript.text

async def transcribe_files_async(file_paths):
    tasks = [transcribe_file_async(path) for path in file_paths]
    results = await asyncio.gather(*tasks)
    return dict(results)

if __name__ == "__main__":
    audio_files = ["audio1.mp3", "audio2.wav", "audio3.m4a"]
    results = asyncio.run(transcribe_files_async(audio_files))
    for file, text in results.items():
        print(f"Async transcription for {file}:\n{text}\n")
output
Async transcription for audio1.mp3:
Hello, this is the first audio file.

Async transcription for audio2.wav:
This is the second audio file transcription.

Async transcription for audio3.m4a:
Final audio file transcription text here.

Troubleshooting

  • If you get AuthenticationError or BadRequestError (named InvalidRequestError in pre-1.0 versions of the library), verify your API key and file format.
  • For RateLimitError, add delays or batch your requests.
  • If transcription is inaccurate, check audio quality and try different Whisper models.
  • Ensure audio files are under 25MB for API upload limits.
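The rate-limit and file-size points above can be combined into one guarded call. This is a sketch, not a mechanism built into the library: transcribe_with_retry and backoff_delay are illustrative names, and the 25MB check mirrors the upload limit noted above.

```python
import os
import time

MAX_FILE_BYTES = 25 * 1024 * 1024  # Whisper API upload limit

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Exponential backoff: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def transcribe_with_retry(client, path, retries=3):
    """Transcribe one file, retrying on rate limits and rejecting oversized files."""
    from openai import RateLimitError  # raised by openai>=1.0 on HTTP 429
    if os.path.getsize(path) > MAX_FILE_BYTES:
        raise ValueError(f"{path} exceeds the 25MB upload limit")
    for attempt in range(retries):
        try:
            with open(path, "rb") as audio_file:
                return client.audio.transcriptions.create(
                    model="whisper-1", file=audio_file
                ).text
        except RateLimitError:
            if attempt == retries - 1:
                raise
            time.sleep(backoff_delay(attempt))
```

Swapping this helper into the loop from the step-by-step section keeps transient 429 responses from aborting a long batch.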

Key Takeaways

  • Use a loop to send each audio file to client.audio.transcriptions.create for batch transcription.
  • Async calls through AsyncOpenAI enable concurrent transcription of multiple files.
  • Ensure audio files are supported formats and under 25MB for Whisper API.
  • Handle API rate limits by batching or adding delays between requests.
  • Set your OpenAI API key securely via environment variables for authentication.
Verified 2026-04 · whisper-1