ValueError
ValueError: compute_type float16 is not supported on CPU
Stack trace
ValueError: compute_type float16 is not supported on CPU
File "/path/to/site-packages/faster_whisper/transcriber.py", line 142, in __init__
raise ValueError(f"compute_type {compute_type} is not supported on {device}")
ValueError: compute_type float16 is not supported on CPU
Why it happens
float16 is a half-precision floating-point format. In faster-whisper it is meant for NVIDIA GPUs running through CUDA; CPUs cannot execute float16 operations at the speed the library expects. Rather than let the model run far slower than intended or crash later, the library rejects this configuration when the model is initialized.
Detection
Check your device detection logic before creating WhisperModel: if compute_type='float16', verify that you are actually running on a GPU, for example with torch.cuda.is_available() or ctranslate2.get_cuda_device_count() > 0. Log both the detected device and the requested compute_type so mismatches surface early in your pipeline.
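The check described above can be sketched as follows. This is a minimal illustration, not part of faster-whisper: `cuda_device_count` and `check_request` are hypothetical helper names, and the sketch assumes the installed CTranslate2 exposes `ctranslate2.get_cuda_device_count()`.

```python
import logging

logger = logging.getLogger("whisper_setup")  # hypothetical logger name


def cuda_device_count() -> int:
    """Return how many CUDA devices CTranslate2 reports, or 0 if it cannot be queried."""
    try:
        import ctranslate2  # backend used by faster-whisper

        return ctranslate2.get_cuda_device_count()
    except Exception:
        # A missing package or a driver problem is treated as "no GPU"
        return 0


def check_request(compute_type: str, gpu_count: int) -> bool:
    """Log the detected device and requested compute_type; return False on a mismatch."""
    device = "cuda" if gpu_count > 0 else "cpu"
    logger.info("detected device=%s, requested compute_type=%s", device, compute_type)
    if compute_type == "float16" and device == "cpu":
        logger.warning("float16 requested but no GPU was detected")
        return False
    return True
```

Calling `check_request("float16", cuda_device_count())` before constructing the model gives a log line to search for when a deployment lands on CPU-only hardware.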
Causes & fixes
Hardcoded compute_type='float16' on a machine with no GPU, or GPU not detected
Change to compute_type='int8' (CPU-compatible, still fast) or compute_type='float32' (slower but compatible). Or check torch.cuda.is_available() and only use float16 when GPU is present.
GPU was available during development but deployment runs on CPU-only hardware
Add device detection at startup: call ctranslate2.get_cuda_device_count() to see whether a GPU is visible and, if one is, ctranslate2.get_supported_compute_types('cuda') to check whether float16 is supported. Fall back to 'int8' if not.
CUDA is installed but not in PATH, so ctranslate2 thinks no GPU is available
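Run `nvidia-smi` to confirm the driver detects the GPU. Then verify that the NVIDIA CUDA toolkit and its libraries are installed and discoverable (PATH on Windows, LD_LIBRARY_PATH on Linux). If ctranslate2 was installed before CUDA and still reports no GPU, reinstall it.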
Using an older CPU with no AVX2 instruction set support, breaking int8 quantization fallback
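Use compute_type='float32' on very old CPUs, or upgrade hardware. On Linux, check for AVX2 with `grep -m1 -o avx2 /proc/cpuinfo`, or ask the backend which types it supports with `python -c "import ctranslate2; print(ctranslate2.get_supported_compute_types('cpu'))"`.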
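The fixes above all come down to choosing a compute type that the current backend actually supports. A minimal sketch of that selection follows. `pick_compute_type` and the `PREFERENCE` order are assumptions made for illustration, and the query at the bottom assumes `ctranslate2.get_supported_compute_types(device)` exists in the installed CTranslate2 version.

```python
from typing import Iterable

# Assumed preference order per device, fastest first; adjust for your workload.
PREFERENCE = {
    "cuda": ("float16", "int8_float16", "int8", "float32"),
    "cpu": ("int8", "float32"),
}


def pick_compute_type(device: str, supported: Iterable[str]) -> str:
    """Return the first preferred compute type that the backend reports as supported."""
    supported = set(supported)
    for candidate in PREFERENCE[device]:
        if candidate in supported:
            return candidate
    raise ValueError(f"no usable compute type for {device}: {sorted(supported)}")


if __name__ == "__main__":
    try:
        import ctranslate2
    except ImportError:
        print("ctranslate2 is not installed; nothing to query")
    else:
        device = "cuda" if ctranslate2.get_cuda_device_count() > 0 else "cpu"
        supported = ctranslate2.get_supported_compute_types(device)
        print(device, pick_compute_type(device, supported))
```

Because the selection takes the supported set as an argument, it can be unit-tested on a machine with no GPU by passing in whatever set the target hardware would report.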
Code: broken vs fixed
from faster_whisper import WhisperModel

# BROKEN: hardcoded float16 without checking device
model = WhisperModel('large-v3', compute_type='float16')  # This raises ValueError on CPU
audio_path = 'speech.mp3'
segments, info = model.transcribe(audio_path)

from faster_whisper import WhisperModel
import torch

# FIXED: detect device and choose compatible compute_type
device = 'cuda' if torch.cuda.is_available() else 'cpu'
compute_type = 'float16' if device == 'cuda' else 'int8'  # Use int8 on CPU, float16 on GPU
model = WhisperModel('large-v3', device=device, compute_type=compute_type)
audio_path = 'speech.mp3'
segments, info = model.transcribe(audio_path)
# segments is a generator; transcription runs as it is iterated
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
print(f"\nTranscription complete. Device: {device}, Compute Type: {compute_type}")
Workaround
If you cannot change the compute_type passed at initialization, wrap the WhisperModel creation in a try-except block, catch the ValueError that mentions float16, and retry with compute_type='int8' so the pipeline falls back gracefully:
try:
    model = WhisperModel(..., compute_type='float16')
except ValueError:
    model = WhisperModel(..., compute_type='int8')
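The fallback above catches every ValueError, including ones unrelated to the compute type. A slightly stricter sketch retries only when the message mentions the rejected type and re-raises anything else. `load_with_fallback` is a hypothetical helper; it takes a factory callable so it does not depend on how the model is constructed.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def load_with_fallback(
    factory: Callable[[str], T],
    preferred: str = "float16",
    fallback: str = "int8",
) -> T:
    """Build a model with `preferred`; retry with `fallback` only if `preferred` was rejected."""
    try:
        return factory(preferred)
    except ValueError as exc:
        if preferred not in str(exc):
            raise  # unrelated ValueError: do not mask it
        return factory(fallback)


# Usage with faster-whisper (assumes the package is installed):
# from faster_whisper import WhisperModel
# model = load_with_fallback(lambda ct: WhisperModel("large-v3", compute_type=ct))
```

Matching on the message text is brittle if the library changes its wording, so treat this as a stopgap rather than a replacement for detecting the device up front.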
Prevention
Implement a helper function that detects available hardware and returns the optimal compute_type before any model initialization. Store this in a config that all transcription jobs read from. Use environment variables (e.g., WHISPER_COMPUTE_TYPE) so deployment ops can override at runtime without code changes. Test your transcription pipeline on both GPU and CPU target machines before production.
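One way to combine the hardware default with the WHISPER_COMPUTE_TYPE override is sketched below. `resolve_compute_type` is a hypothetical name, and ignoring a float16 override on CPU is a design assumption; it could just as well raise.

```python
import os
from typing import Mapping


def resolve_compute_type(device: str, env: Mapping[str, str] = os.environ) -> str:
    """Pick a compute type for `device`, honouring a WHISPER_COMPUTE_TYPE override."""
    default = "float16" if device == "cuda" else "int8"
    override = env.get("WHISPER_COMPUTE_TYPE", "").strip().lower()
    if not override:
        return default
    if override == "float16" and device != "cuda":
        # This override would reproduce the error, so the hardware default wins
        return default
    return override
```

Passing the environment as a parameter keeps the function easy to test: `resolve_compute_type("cpu", {"WHISPER_COMPUTE_TYPE": "float32"})` exercises the override path without touching the real process environment.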