Common install failures and fixes
Why this matters
A broken PyTorch install wastes hours: you end up debugging code that works fine when the real problem is your environment. Knowing the specific error patterns turns that into a fix that takes minutes.
Explanation
What it is: PyTorch installation fails when three conditions misalign: CUDA compute capability, PyTorch binary version, and system architecture. The error messages often hide the real problem.
How it works mechanically: When you install PyTorch, you choose a build for a specific CUDA version (11.8, 12.1, etc.) or a CPU-only build. pip then downloads a binary compiled against that CUDA version. If your GPU or driver doesn't support that build, or if you installed a binary for the wrong architecture (e.g., x86_64 vs ARM), the import can fail outright, CUDA can be reported as unavailable, or the first GPU operation can crash mid-training. The diagnostic code below checks four things: Python version match, CUDA availability, CUDA version alignment, and tensor allocation.
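The architecture side of that mismatch can be inspected before torch is even imported. The following is a minimal sketch using only the standard library; the function name describe_platform is a hypothetical helper for this illustration, not part of PyTorch.

```python
import platform
import struct
import sys


def describe_platform():
    """Collect the platform facts that influence which wheel pip selects.

    Hypothetical helper for illustration; it only reads standard-library values.
    """
    return {
        "os": platform.system(),        # e.g. "Linux", "Darwin", "Windows"
        "machine": platform.machine(),  # e.g. "x86_64", "arm64", "aarch64"
        "python": "{}.{}".format(*sys.version_info[:2]),
        "pointer_bits": struct.calcsize("P") * 8,  # 64 on a 64-bit interpreter
    }


if __name__ == "__main__":
    for key, value in describe_platform().items():
        print(f"{key}: {value}")
```

If the machine value is not what you expected (for example, an x86_64 Python running under emulation on an ARM laptop), that is worth resolving before looking at CUDA at all.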
When to use it: Run this verification immediately after pip install torch on any new machine, container, or after a system update. If any check fails, the fixes are deterministic: reinstall with the correct flags.
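As a rough illustration of "reinstall with the correct flags", the sketch below builds the pip command for a given build tag. It assumes the wheel index layout at https://download.pytorch.org/whl/ followed by a tag such as cpu, cu118, or cu121; the helper name install_command is hypothetical, and it does not check whether a particular torch version is actually published for a given tag.

```python
PYTORCH_WHEEL_INDEX = "https://download.pytorch.org/whl"


def install_command(build="cpu", version=None):
    """Return a pip command string for a PyTorch build.

    build:   "cpu" or a CUDA tag such as "cu118" or "cu121" (assumed naming).
    version: optional exact torch version to pin, e.g. "2.1.0".
    This sketch only handles CPU and CUDA tags, not other backends.
    """
    is_cuda_tag = build.startswith("cu") and build[2:].isdigit()
    if build != "cpu" and not is_cuda_tag:
        raise ValueError(f"unrecognised build tag: {build!r}")
    package = "torch" if version is None else f"torch=={version}"
    return f"pip install {package} --index-url {PYTORCH_WHEEL_INDEX}/{build}"


if __name__ == "__main__":
    print(install_command("cu118"))
    print(install_command("cpu", version="2.1.0"))
```

Printing the command rather than running it keeps the sketch side-effect free; uninstall the existing torch package first so pip does not keep the mismatched build.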
Analogy
Installing PyTorch without verification is like shipping a car engine without starting it first: everything looks right until you drive 100 miles and it stalls. The diagnostic code is your test drive.
Code
import sys
import torch

# Interpreter and library versions
print(f"Python version: {sys.version.split()[0]}")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    # Versions the binary was built against, plus the detected GPU
    print(f"CUDA version: {torch.version.cuda}")
    print(f"cuDNN version: {torch.backends.cudnn.version()}")
    print(f"GPU device: {torch.cuda.get_device_name(0)}")
    print(f"GPU compute capability: {torch.cuda.get_device_capability(0)}")
    try:
        # Allocate on the GPU and run a kernel to confirm it executes
        test_tensor = torch.randn(10, 10).cuda()
        result = torch.matmul(test_tensor, test_tensor)
        print(f"GPU tensor test passed. Result shape: {result.shape}")
    except RuntimeError as e:
        print(f"GPU tensor test FAILED: {e}")
else:
    print("CUDA not available: CPU-only mode")

# The CPU test runs on every system
test_tensor = torch.randn(10, 10)
result = torch.matmul(test_tensor, test_tensor)
print(f"CPU tensor test passed. Result shape: {result.shape}")

Output
Python version: 3.11.9
PyTorch version: 2.11.0+cu121
CUDA available: True
CUDA version: 12.1
cuDNN version: 8902
GPU device: NVIDIA GeForce RTX 4090
GPU compute capability: (8, 9)
GPU tensor test passed. Result shape: torch.Size([10, 10])
CPU tensor test passed. Result shape: torch.Size([10, 10])
What just happened?
The code imported PyTorch and printed its version, then checked whether CUDA is available on the system. Because CUDA was available, it queried the CUDA and cuDNN versions, the GPU device name, and the compute capability. It then created a small tensor on the GPU, performed a matrix multiplication, and confirmed the computation succeeded. On a CPU-only system, it skips the GPU steps and prints a CPU-only notice instead. The CPU tensor test runs in both cases. The output shows all nine lines a CUDA machine should print, in order: if any line is missing or shows an error, the installation is incomplete.
Common gotcha
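The most common mistake is installing a CPU-only build of torch when you meant to install one with CUDA support, or vice versa. What a plain pip install torch gives you depends on the platform: the default wheels typically include CUDA on Linux x86_64 but are CPU-only on Windows and macOS. The mismatch raises no error at install time, so you won't know until you try to call .cuda() in production. The second gotcha: a CUDA 12.1 build will not run on a machine whose NVIDIA driver only supports up to CUDA 11.8. The pip wheels bundle their own CUDA runtime, so what has to line up is the CUDA version the driver supports (the one nvidia-smi reports), not a separately installed toolkit.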
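One way to tell which build you actually have is the local suffix on the version string, such as the +cu121 in 2.11.0+cu121 from the output above. The sketch below parses that suffix; the helper name build_flavor is hypothetical, and it assumes the suffix convention seen in that output. Some wheels carry no suffix at all, in which case the string alone cannot answer the question.

```python
def build_flavor(version_string):
    """Return the build suffix of a torch version string.

    "2.11.0+cu121" -> "cu121", "2.1.0+cpu" -> "cpu".
    Returns "unknown" when there is no "+suffix" part to inspect.
    """
    _, _, local = version_string.partition("+")
    return local if local else "unknown"


if __name__ == "__main__":
    for example in ("2.11.0+cu121", "2.1.0+cpu", "2.1.0"):
        print(f"{example}: {build_flavor(example)}")
```

In a real check you would pass torch.__version__ to it. When the result is "unknown", torch.version.cuda is the fallback: it is None on a CPU-only build.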
Error recovery
RuntimeError: CUDA out of memory
ImportError: libcudart.so.12 not found
RuntimeError: CUDA error: no kernel image is available for execution on the device
torch.cuda.is_available() returns False despite nvidia-smi showing GPU
Experienced dev note
The single most valuable insight: always run this diagnostic code immediately after install, even on your dev machine, rather than waiting until something breaks in production. A 30-second check saves you 3 hours of debugging a model that trains fine locally but fails remotely because the environments differ. Also, pin your PyTorch version in requirements.txt as torch==2.11.0, not torch. Version mismatches between dev and prod are a silent killer: someone will eventually run pip install -r requirements.txt six months later, when PyTorch 2.12 exists, and hit an incompatible API.
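The pinning advice is easy to check mechanically. The sketch below flags requirement lines that lack an exact == pin; the helper name unpinned_requirements and the sample lines are hypothetical, and the parsing is deliberately simple (it skips comments and pip options, and does not understand URLs or environment markers).

```python
import re

# A name, optional extras in brackets, then an exact "==" pin.
PINNED = re.compile(r"^([A-Za-z0-9_.\-]+)(\[[^\]]*\])?\s*==\s*\S+")


def unpinned_requirements(lines):
    """Return the requirement lines that are not pinned with '=='."""
    loose = []
    for line in lines:
        stripped = line.split("#", 1)[0].strip()  # drop trailing comments
        if not stripped or stripped.startswith("-"):
            continue  # blank line, comment, or a pip option such as --index-url
        if not PINNED.match(stripped):
            loose.append(stripped)
    return loose


if __name__ == "__main__":
    sample = ["torch", "numpy==1.26.4", "# dev tools", "torchvision>=0.16"]
    print(unpinned_requirements(sample))
```

Running something like this in CI against requirements.txt catches a bare torch entry before it reaches production.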
Check your understanding
If you installed PyTorch with CUDA 12.1 support, but nvidia-smi reports CUDA Version 11.8 (the highest version your driver supports), what would you see when you try to run a model on GPU, and what is the exact fix?
Answer hint
The typical symptom is torch.cuda.is_available() returning False with a warning that the NVIDIA driver is too old, or a RuntimeError when the first CUDA call runs. The fix is to reinstall PyTorch from the cu118 wheel index so the build matches what your driver supports. The alternative is to update the NVIDIA driver so that it supports CUDA 12.1, which is often not an option on shared or managed machines.
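The comparison behind that answer can be sketched as a small version check. This assumes the "CUDA Version" field that nvidia-smi prints is the highest CUDA version the driver supports, and it uses a conservative rule of thumb (driver version must be at least the build's version); NVIDIA's minor-version compatibility may allow more combinations than this reports. All function names here are hypothetical.

```python
def parse_cuda_version(text):
    """Turn a string like "11.8" into a comparable tuple (11, 8)."""
    major, minor = text.strip().split(".")[:2]
    return int(major), int(minor)


def wheel_tag(cuda_version):
    """Map a CUDA version string to the assumed wheel tag: "11.8" -> "cu118"."""
    major, minor = parse_cuda_version(cuda_version)
    return f"cu{major}{minor}"


def driver_supports(driver_cuda, build_cuda):
    """Conservative check: the driver's CUDA version is at least the build's."""
    return parse_cuda_version(driver_cuda) >= parse_cuda_version(build_cuda)


if __name__ == "__main__":
    driver, build = "11.8", "12.1"  # values from the question above
    if driver_supports(driver, build):
        print("driver and build are compatible")
    else:
        print(f"driver too old for this build; try the {wheel_tag(driver)} wheels")
```

In practice the build value would come from torch.version.cuda and the driver value from the nvidia-smi header.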
The torch.cuda.amp.autocast() context manager is deprecated: use torch.amp.autocast('cuda') instead. CUDA compute capability checks are the same across all 2.x versions, but the wheel URLs and recommended CUDA versions have shifted: CUDA 11.8 and 12.1 are now the standard targets, and CUDA 10.x is no longer officially supported.