RuntimeError or ValueError
whisperx.alignment.align_model.align(): Language not supported error
Stack trace
RuntimeError: Language 'xx' is not supported for alignment. Supported languages: ['en', 'fr', 'de', 'es', 'it', 'pt', 'nl', 'ru', 'pl', 'ja', 'zh', 'ar', 'hi', 'tr']
  File "/usr/local/lib/python3.11/site-packages/whisperx/alignment.py", line 142, in align
    raise RuntimeError(f"Language '{language}' is not supported for alignment. Supported languages: {SUPPORTED_LANGUAGES}")
RuntimeError: Language 'uk' is not supported for alignment
Why it happens
WhisperX's alignment step, which produces precise word-level timestamps, relies on Wav2Vec2-based models and supports only a limited set of languages (the 14 listed in the error message above). When Whisper transcribes audio in a language such as Ukrainian, Croatian, or Vietnamese, the detected language code is not in that list, so the alignment step raises an error. This is a limitation of the alignment models, not of Whisper itself.
Detection
Check the language code before calling align() and verify it against WhisperX's supported-languages list. Log both the language Whisper detected and the alignment model's supported languages so that mismatches surface early. Apply any language fallback before alignment is attempted.
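The check described above can be sketched as follows. This is a minimal illustration, not WhisperX's own API: the helper name can_align, the logger name, and the SUPPORTED_ALIGNMENT_LANGUAGES set (copied from the error message) are assumptions, and the real list may differ in your installed version.

```python
import logging

logger = logging.getLogger("alignment_check")  # hypothetical logger name

# Assumption: copied from the error message above; the actual list depends
# on the installed WhisperX version.
SUPPORTED_ALIGNMENT_LANGUAGES = frozenset(
    {"en", "fr", "de", "es", "it", "pt", "nl", "ru", "pl", "ja", "zh", "ar", "hi", "tr"}
)


def can_align(detected_language, supported=SUPPORTED_ALIGNMENT_LANGUAGES):
    """Return True if the detected language is in the supported set.

    Logs the detected language and, on a mismatch, the supported list,
    so the problem is visible before alignment is attempted.
    """
    if detected_language in supported:
        logger.info("Language %r is supported for alignment.", detected_language)
        return True
    logger.warning(
        "Language %r is not supported for alignment. Supported: %s",
        detected_language,
        sorted(supported),
    )
    return False
```

Calling can_align(result['language']) right after transcription gives a single place to decide whether to align, fall back, or skip.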
Causes & fixes
Whisper detected a language that the WhisperX alignment model doesn't support (e.g., Ukrainian, Croatian, Vietnamese, Greek).
Add a language fallback: if the detected language is not in SUPPORTED_LANGUAGES, either call align() with the closest supported language or skip alignment entirely and use Whisper's native timestamps.
A language code was specified manually in align() that doesn't exist in WhisperX's supported list.
Validate the language parameter against WhisperX's SUPPORTED_LANGUAGES before calling align(). Use only these language codes: 'en', 'fr', 'de', 'es', 'it', 'pt', 'nl', 'ru', 'pl', 'ja', 'zh', 'ar', 'hi', 'tr'.
Language code format mismatch (e.g., passing 'en-US' instead of 'en', or an ISO 639-2 code instead of an ISO 639-1 code).
Normalize language codes to the two-letter ISO 639-1 format before passing them to align(). Strip region codes: convert 'en-US' to 'en' and 'pt-BR' to 'pt'.
An outdated WhisperX version with fewer supported languages is installed.
Upgrade WhisperX to the latest version: pip install --upgrade whisperx. Check the release notes for newly supported languages.
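The code-normalization fix above could look roughly like this. It is a sketch under stated assumptions: normalize_language_code is a hypothetical helper, and the ISO 639-2 to ISO 639-1 table is deliberately partial (extend it for the languages you handle).

```python
# Illustrative, partial mapping from ISO 639-2 (three-letter) codes to
# ISO 639-1 (two-letter) codes. Not exhaustive.
ISO_639_2_TO_1 = {
    "eng": "en", "fra": "fr", "fre": "fr", "deu": "de", "ger": "de",
    "spa": "es", "ita": "it", "por": "pt", "nld": "nl", "dut": "nl",
    "rus": "ru", "pol": "pl", "jpn": "ja", "zho": "zh", "chi": "zh",
    "ara": "ar", "hin": "hi", "tur": "tr",
}


def normalize_language_code(code):
    """Reduce a language tag to a lowercase two-letter code where possible.

    'en-US' -> 'en', 'pt_BR' -> 'pt', 'DEU' -> 'de'. Codes that are not in
    the table are returned lowercased with any region subtag stripped.
    """
    # Treat '_' and '-' the same, then keep only the primary subtag.
    primary = code.strip().replace("_", "-").split("-", 1)[0].lower()
    return ISO_639_2_TO_1.get(primary, primary)
```

Normalizing first and validating second keeps the supported-language check from rejecting a code such as 'en-US' that only differs in format.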
Code: broken vs fixed
import whisper
import whisperx
import os
# BROKEN: No language validation before alignment attempt
audio_file = 'ukrainian_speech.mp3'
device = 'cuda'
model = whisper.load_model('base')
result = model.transcribe(audio_file)
# This will crash if language is not in supported list
model_a, metadata = whisperx.load_align_model(
    language_code=result['language'],
    device=device
)
aligned_result = whisperx.align(
    result['segments'],
    model_a,
    metadata,
    audio_file,
    device,
    language=result['language']  # ← CRASHES here if language not supported
)
print(aligned_result)

import whisper
import whisperx
import os
# FIXED: Validate language before alignment with fallback
audio_file = 'ukrainian_speech.mp3'
device = 'cuda'
# List of languages supported by WhisperX alignment model
SUPPORTED_ALIGNMENT_LANGUAGES = {
    'en', 'fr', 'de', 'es', 'it', 'pt', 'nl', 'ru', 'pl', 'ja', 'zh', 'ar', 'hi', 'tr'
}
model = whisper.load_model('base')
result = model.transcribe(audio_file)
detected_language = result['language']
print(f"Detected language: {detected_language}")
# FIXED: Check if language is supported, fall back to English if not
if detected_language not in SUPPORTED_ALIGNMENT_LANGUAGES:
    print(f"Warning: Language '{detected_language}' not supported for alignment. Falling back to 'en'.")
    alignment_language = 'en'  # Fallback to English for alignment
else:
    alignment_language = detected_language
try:
    model_a, metadata = whisperx.load_align_model(
        language_code=alignment_language,
        device=device
    )
    aligned_result = whisperx.align(
        result['segments'],
        model_a,
        metadata,
        audio_file,
        device,
        language=alignment_language  # ← Now guaranteed to be supported
    )
    print(f"Alignment successful for language: {alignment_language}")
    print(aligned_result)
except (RuntimeError, ValueError) as e:
    # The error may surface as either exception type; handle both.
    if "not supported" in str(e):
        print(f"Alignment failed: {e}. Skipping alignment step.")
        aligned_result = result  # Use Whisper's native timestamps as fallback
    else:
        raise
Workaround
If you cannot upgrade WhisperX or need to preserve the original language, skip the alignment step entirely and use Whisper's native segment timestamps (result['segments'][i]['start'] and result['segments'][i]['end']). Word-level timestamps from alignment will be unavailable, but the transcription and segment-level timing are unaffected. Alternatively, use translation preprocessing: detect the language and, if it is unsupported, translate to English, transcribe in English, then translate the output back to the original language.
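A minimal sketch of the skip-alignment workaround: it reads segment-level timing straight from a Whisper-style result dictionary. The helper name segment_timestamps and the sample data are hypothetical; only the 'segments', 'start', 'end', and 'text' keys come from the text above.

```python
def segment_timestamps(result):
    """Return (start, end, text) tuples from Whisper's native segments."""
    return [
        (seg["start"], seg["end"], seg["text"].strip())
        for seg in result["segments"]
    ]


# Hypothetical result, shaped like the output of model.transcribe().
sample_result = {
    "language": "uk",
    "segments": [
        {"start": 0.0, "end": 2.5, "text": " first segment"},
        {"start": 2.5, "end": 5.0, "text": " second segment"},
    ],
}

for start, end, text in segment_timestamps(sample_result):
    print(f"[{start:.2f} -> {end:.2f}] {text}")
```

This gives subtitle-grade timing per segment without loading any alignment model.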
Prevention
Build a language-aware transcription pipeline that checks supported languages upfront. Store WhisperX's supported-language list in your config and validate both any user-specified language and the language Whisper detects against this list before processing. Implement graceful degradation: if alignment fails, log the failure and continue with Whisper-native timestamps instead of crashing. Consider running a text-based language detector (e.g., langdetect) on the transcript as a secondary check, so that unsupported languages are caught before alignment and users can be alerted.
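The graceful-degradation idea above might be wrapped up like this. It is a sketch, not a definitive implementation: align_or_fallback and its align_fn parameter are hypothetical, with align_fn standing in for whatever function in your pipeline loads the alignment model and runs alignment.

```python
import logging

logger = logging.getLogger("transcription_pipeline")  # hypothetical name


def align_or_fallback(result, align_fn, supported):
    """Try alignment; on any unsupported-language path, keep native timestamps.

    result    -- Whisper-style dict with 'language' and 'segments'
    align_fn  -- hypothetical callable (result, language) -> aligned result
    supported -- set of language codes taken from your config
    """
    language = result.get("language", "")
    if language not in supported:
        logger.warning(
            "Language %r not supported for alignment; using native timestamps.",
            language,
        )
        return result
    try:
        return align_fn(result, language)
    except (RuntimeError, ValueError) as exc:
        # Log and degrade instead of crashing the whole pipeline.
        logger.warning("Alignment failed (%s); using native timestamps.", exc)
        return result
```

Passing the alignment step in as a callable keeps the fallback logic easy to test without a GPU or model downloads.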