High severity intermediate · Fix: 5-10 min

ValueError

ValueError: compute_type float16 is not supported on CPU

What this error means
Faster-Whisper raises ValueError when you specify compute_type='float16' on a CPU device, because float16 hardware acceleration requires NVIDIA CUDA support.

Stack trace

traceback
ValueError: compute_type float16 is not supported on CPU
  File "/path/to/site-packages/faster_whisper/transcriber.py", line 142, in __init__
    raise ValueError(f"compute_type {compute_type} is not supported on {device}")
ValueError: compute_type float16 is not supported on CPU
QUICK FIX
Change compute_type='float16' to compute_type='int8' for CPU, or detect GPU availability and only use float16 on CUDA devices.

Why it happens

Faster-Whisper's float16 compute type is a half-precision floating-point format that requires specialized NVIDIA CUDA hardware to execute efficiently. CPUs cannot natively compute float16 operations at the performance level faster-whisper expects, so the library explicitly blocks this configuration at initialization time to prevent silent performance degradation or crashes.

Detection

Check your device detection logic before creating WhisperModel: verify that if compute_type='float16', you are actually running on a GPU (check torch.cuda.is_available() or via ctranslate2.get_device()). Log both the detected device and the requested compute_type to catch mismatches early in your pipeline.

Causes & fixes

1

Hardcoded compute_type='float16' on a machine with no GPU, or GPU not detected

✓ Fix

Change to compute_type='int8' (CPU-compatible, still fast) or compute_type='float32' (slower but compatible). Or check torch.cuda.is_available() and only use float16 when GPU is present.

2

GPU was available during development but deployment runs on CPU-only hardware

✓ Fix

Add device detection at startup: use ctranslate2.get_available_compute_type('cuda') to check if float16 is supported, fall back to 'int8' if not.

3

CUDA is installed but not in PATH, so ctranslate2 thinks no GPU is available

✓ Fix

Verify NVIDIA CUDA toolkit is installed and in system PATH. Run `nvidia-smi` to confirm GPU detection. Rebuild ctranslate2 wheel after installing CUDA.

4

Using an older CPU with no AVX2 instruction set support, breaking int8 quantization fallback

✓ Fix

Use compute_type='float32' on very old CPUs, or upgrade hardware. Check CPU instruction set with `python -c "import platform; print(platform.processor())"`.

Code: broken vs fixed

Broken - triggers the error
python
from faster_whisper import WhisperModel
import torch

# BROKEN: hardcoded float16 without checking device
model = WhisperModel('large-v3', compute_type='float16')  # This raises ValueError on CPU
audio_path = 'speech.mp3'
segments, info = model.transcribe(audio_path)
Fixed - works correctly
python
from faster_whisper import WhisperModel
import torch
import os

# FIXED: detect device and choose compatible compute_type
device = 'cuda' if torch.cuda.is_available() else 'cpu'
compute_type = 'float16' if device == 'cuda' else 'int8'  # Use int8 on CPU, float16 on GPU

model = WhisperModel('large-v3', device=device, compute_type=compute_type)
audio_path = 'speech.mp3'
segments, info = model.transcribe(audio_path)

for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
print(f"\nTranscription complete. Device: {device}, Compute Type: {compute_type}")
Added device detection using torch.cuda.is_available() to choose float16 only on GPU, falling back to int8 (CPU-compatible) on CPU, preventing the ValueError.

Workaround

If you cannot change the compute_type at initialization, wrap the WhisperModel creation in a try-except block, catch the ValueError mentioning float16, and retry with compute_type='int8'. This allows graceful fallback: `try: model = WhisperModel(..., compute_type='float16') except ValueError: model = WhisperModel(..., compute_type='int8')`.

Prevention

Implement a helper function that detects available hardware and returns the optimal compute_type before any model initialization. Store this in a config that all transcription jobs read from. Use environment variables (e.g., WHISPER_COMPUTE_TYPE) so deployment ops can override at runtime without code changes. Test your transcription pipeline on both GPU and CPU target machines before production.

Python 3.8+ · faster-whisper >=0.9.0 · tested on 1.0.x
Verified 2026-04
Verify ↗

Community Notes

No notes yetBe the first to share a version-specific fix or tip.