Whisper supported languages list
Quick answer
The Whisper API supports transcription for over 30 languages, including English, Spanish, French, German, Chinese, Japanese, Russian, Portuguese, Italian, and Dutch. It can also translate speech in those languages into English text.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0"
Setup
Install the openai Python package and set your API key as an environment variable.
pip install "openai>=1.0"
Step by step
Use the openai Python SDK to transcribe audio and specify the language code for best accuracy. Below is a sample for English audio transcription.
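from openai import OpenAI
import os

# Read the API key from the environment rather than hard-coding it
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Open the audio in binary mode and request an English transcription
with open("audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        language="en",
    )
print(transcript.text)
Output: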
This is the transcribed text from the audio.
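The translation capability mentioned in the quick answer goes through a separate call in the SDK. The following is a minimal sketch, assuming `client` is an `openai.OpenAI` instance like the one created above; the helper name `translate_to_english` is hypothetical and used only for illustration.

```python
def translate_to_english(client, audio_path):
    """Translate speech in `audio_path` into English text.

    `client` is assumed to be an openai.OpenAI instance (SDK >= 1.0).
    The function name is illustrative, not part of the SDK.
    """
    with open(audio_path, "rb") as audio_file:
        # The translations call targets English output, so no
        # `language` argument is passed here.
        result = client.audio.translations.create(
            model="whisper-1",
            file=audio_file,
        )
    return result.text
```

Calling `translate_to_english(client, "audio_spanish.mp3")` would then return English text for Spanish speech. Passing the client in as an argument keeps the helper easy to exercise without a live API key.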
Common variations
You can specify other supported language codes such as es for Spanish, fr for French, or zh for Chinese. The Whisper model also supports automatic language detection if you omit the language parameter.
# Reuses the client created in the previous example
with open("audio_spanish.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        language="es",
    )
print(transcript.text)
Output:
Este es el texto transcrito del audio.
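Automatic detection, as described at the start of this section, amounts to leaving `language` out of the request. The sketch below assumes `client` is an `openai.OpenAI` instance and asks for `response_format="verbose_json"` so the response carries the detected language alongside the text. The helper name `transcribe` is hypothetical, and the detected language may come back as a name rather than a two-letter code.

```python
def transcribe(client, audio_path, language=None):
    """Return (text, language) for the audio at `audio_path`.

    When `language` is None the parameter is omitted entirely, which
    leaves language detection to the model.
    """
    request = {
        "model": "whisper-1",
        # verbose_json includes a `language` field in the response
        "response_format": "verbose_json",
    }
    if language is not None:
        request["language"] = language
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(file=audio_file, **request)
    return result.text, result.language
```

For example, `transcribe(client, "audio_spanish.mp3")` relies on detection, while `transcribe(client, "audio_spanish.mp3", language="es")` pins the language explicitly.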
Supported languages list
The languages the Whisper API supports include the following:
| Language | Code |
|---|---|
| English | en |
| Spanish | es |
| French | fr |
| German | de |
| Chinese | zh |
| Japanese | ja |
| Russian | ru |
| Portuguese | pt |
| Italian | it |
| Dutch | nl |
| Korean | ko |
| Arabic | ar |
| Turkish | tr |
| Polish | pl |
| Vietnamese | vi |
| Indonesian | id |
| Swedish | sv |
| Danish | da |
| Norwegian | no |
| Finnish | fi |
| Czech | cs |
| Hungarian | hu |
| Greek | el |
| Hebrew | he |
| Hindi | hi |
| Thai | th |
| Ukrainian | uk |
| Romanian | ro |
| Bulgarian | bg |
| Catalan | ca |
| Malay | ms |
| Slovak | sk |
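If user input arrives as a language name rather than a code, the table above can be turned into a small lookup before calling the API. This is a sketch only: `LANGUAGE_CODES` and `to_language_code` are hypothetical names, and the mapping covers just the languages listed in this article.

```python
# Name -> code mapping copied from the table above (not an exhaustive list)
LANGUAGE_CODES = {
    "english": "en", "spanish": "es", "french": "fr", "german": "de",
    "chinese": "zh", "japanese": "ja", "russian": "ru", "portuguese": "pt",
    "italian": "it", "dutch": "nl", "korean": "ko", "arabic": "ar",
    "turkish": "tr", "polish": "pl", "vietnamese": "vi", "indonesian": "id",
    "swedish": "sv", "danish": "da", "norwegian": "no", "finnish": "fi",
    "czech": "cs", "hungarian": "hu", "greek": "el", "hebrew": "he",
    "hindi": "hi", "thai": "th", "ukrainian": "uk", "romanian": "ro",
    "bulgarian": "bg", "catalan": "ca", "malay": "ms", "slovak": "sk",
}


def to_language_code(value):
    """Accept a language name or code from the table and return the code."""
    key = value.strip().lower()
    if key in LANGUAGE_CODES:
        return LANGUAGE_CODES[key]
    if key in LANGUAGE_CODES.values():
        return key
    raise ValueError(f"Language not in the table: {value!r}")
```

The returned code can be passed directly as the `language` argument, for instance `language=to_language_code("Spanish")`.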
Key Takeaways
- Use the language parameter to specify the audio language for better transcription accuracy.
- Whisper supports over 30 major languages, including English, Spanish, and Chinese.
- Omitting language enables automatic language detection by the model.
- The openai Python SDK provides a simple interface for audio transcription with whisper-1.
- Ensure your audio file is in a supported format (mp3, mp4, mpeg, mpga, m4a, wav, webm) and under 25 MB for API use.
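The format and size constraints in the last takeaway can be checked locally before uploading. The sketch below is one possible pre-flight check; `check_audio_file` is a hypothetical helper, it judges the format by file extension only, and it assumes the 25 MB limit means 25 × 1024 × 1024 bytes.

```python
import os

# Extensions listed in the takeaway above
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
# Assumption: "25 MB" is read as 25 * 1024 * 1024 bytes
MAX_BYTES = 25 * 1024 * 1024


def check_audio_file(path):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    extension = os.path.splitext(path)[1].lstrip(".").lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'none'}")
    size = os.path.getsize(path)
    if size >= MAX_BYTES:
        problems.append(f"file too large: {size} bytes")
    return problems
```

A caller might run `check_audio_file("audio.mp3")` first and only send the request when the result is empty, which avoids a round trip for files the API would likely reject.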