Tensor device: cpu vs cuda
Why this matters
If you mix CPU and GPU tensors in a single operation, PyTorch raises a RuntimeError. Knowing how to check where a tensor lives and how to move it is essential before training any model on a GPU.
Explanation
Every PyTorch tensor has a device attribute that specifies where it lives: CPU RAM or GPU memory (CUDA). Tensors can only perform operations with other tensors on the same device.
When you create a tensor without specifying a device, it defaults to the CPU. To use GPU acceleration, you must either (1) create tensors directly on the GPU, or (2) move CPU tensors to the GPU using .to(device) or .cuda(). The device object is typically torch.device('cpu') or torch.device('cuda'); on multi-GPU machines, an index such as torch.device('cuda:1') selects a specific GPU. You can check whether CUDA is available on your machine with torch.cuda.is_available().
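The two options above can be sketched as follows. This is a minimal example that falls back to the CPU when no GPU is present; the variable names are illustrative only.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Option 1: allocate the tensor directly on the target device.
direct = torch.zeros(2, 3, device=device)

# Option 2: create on the CPU (the default), then copy to the target device.
on_cpu = torch.ones(2, 3)
moved = on_cpu.to(device)

# Both tensors now live on the same device, so they can be combined.
total = direct + moved
print(direct.device, moved.device, total.device)
```

Option 1 skips the intermediate CPU allocation and the copy, which may matter for large tensors.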
The pattern is: check if GPU exists, set a device variable, then move all tensors and your model to that device at the start of your training script.
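As a rough sketch, that pattern might look like this at the top of a script. The nn.Linear model and the random inputs are hypothetical stand-ins for a real model and dataset.

```python
import torch
import torch.nn as nn

# 1. Check for a GPU and fix the device once, at the top of the script.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 2. Move the model (a hypothetical stand-in) to that device.
model = nn.Linear(4, 2).to(device)

# 3. Move the data to the same device before using it.
inputs = torch.randn(8, 4).to(device)

# The forward pass works because weights and inputs share a device.
outputs = model(inputs)
print(outputs.shape, outputs.device)
```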
Analogy
Think of CPU and GPU as two separate mail rooms. A letter (tensor) sitting in the CPU mail room can't be processed by the GPU mail room workers: you must physically move the letter to the GPU mail room first. Once it's there, the GPU workers can operate on it at high speed.
Code
import torch

# Check whether a CUDA-capable GPU is visible to PyTorch.
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA device count: {torch.cuda.device_count()}")

# Fall back to the CPU when no GPU is present.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

# Tensors are created on the CPU by default.
tensor_cpu = torch.randn(3, 4)
print(f"tensor_cpu device: {tensor_cpu.device}")

# .to(device) returns a tensor on the target device.
tensor_gpu = tensor_cpu.to(device)
print(f"tensor_gpu device: {tensor_gpu.device}")

# .cpu() brings a tensor back to CPU memory.
tensor_back = tensor_gpu.cpu()
print(f"tensor_back device: {tensor_back.device}")

# Mixing devices raises RuntimeError on a GPU machine; on a CPU-only
# machine both tensors are on the CPU, so the addition succeeds silently.
try:
    result = tensor_cpu + tensor_gpu
except RuntimeError as e:
    print(f"Error when mixing devices: {type(e).__name__}")

# Tensors on the same device can be combined.
tensor_cpu2 = torch.randn(3, 4)
result = tensor_cpu + tensor_cpu2
print(f"Same device operation successful: {result.device}")

Output (on a CPU-only machine):
CUDA available: False
CUDA device count: 0
Using device: cpu
tensor_cpu device: cpu
tensor_gpu device: cpu
tensor_back device: cpu
Same device operation successful: cpu
What just happened?
The code checked for GPU availability (False in this CPU-only environment), set a device variable that falls back to the CPU, created a tensor on the CPU, and called .to(device), which left the tensor on the CPU because CUDA is unavailable. It then tried to add tensor_cpu and tensor_gpu: on a GPU machine this raises a RuntimeError, while on a CPU-only machine both tensors are on the CPU and the addition succeeds. Finally, it added two CPU tensors. The key line is <code>.to(device)</code>, which returns a copy of the tensor on the specified device (or the tensor itself if it is already there).
Common gotcha
The most common mistake: creating a model, moving it to GPU with model.to(device), but then forgetting to move your data (input tensors and labels) to the same device before passing them to the model. The model is on GPU but your batch is on CPU: boom, RuntimeError. Always move both model AND data.
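One way to avoid this is to move every batch to the device as the first step inside the training loop. The sketch below assumes a toy regression setup; the model, loss, optimizer, and random batches are hypothetical placeholders for whatever your script actually uses.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(4, 1).to(device)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Hypothetical data: three batches of (inputs, labels) created on the CPU,
# as a data loader would typically yield them.
batches = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(3)]

losses = []
for inputs, labels in batches:
    # Move BOTH inputs and labels to the model's device on every iteration.
    inputs, labels = inputs.to(device), labels.to(device)

    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(losses)
```

Forgetting the labels is as easy as forgetting the inputs: the forward pass succeeds, and the mismatch only surfaces when the loss is computed.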
Error recovery
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
RuntimeError: CUDA out of memory
Experienced dev note
A pattern that saves hours of debugging: always set device at the top of your script, then use it consistently. Even better, pass device as a parameter through your training function. One gotcha: for tensors, .to(device) returns a new tensor on the target device; it doesn't modify the tensor in place, and there is no in-place variant. Always capture the return value, as in x = x.to(device). (model.to(device), by contrast, moves a module's parameters in place.) In performance-critical code, .to(device, non_blocking=True) lets the host continue with other work while the copy runs asynchronously; this generally requires the source tensor to be in pinned (page-locked) CPU memory (advanced, but worth knowing).
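The return-value behaviour of .to() can be checked with a short sketch. Because a device move is a no-op on a CPU-only machine, a dtype conversion is used here as a stand-in to make the effect visible everywhere; the pinned-memory part only runs when CUDA is available.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(2, 2)

# Discarding the return value leaves x untouched: it is still float32.
x.to(torch.float64)
unchanged_dtype = x.dtype

# Capturing the return value is what actually gives you the converted tensor.
y = x.to(torch.float64)

# The same rule applies to devices: rebind the name to the returned tensor.
x = x.to(device)

# nn.Module.to() is different: it moves parameters in place and returns self.
model = nn.Linear(2, 2)
same_object = model.to(device) is model

# non_blocking=True only helps when copying from pinned host memory to a GPU,
# so this part is skipped on CPU-only machines.
if torch.cuda.is_available():
    pinned = torch.randn(2, 2).pin_memory()
    on_gpu = pinned.to(device, non_blocking=True)
    torch.cuda.synchronize()  # wait for the asynchronous copy to finish

print(unchanged_dtype, y.dtype, x.device, same_object)
```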
Check your understanding
If you have a model on CUDA and a batch of data on CPU, what happens when you call model(batch)? Why? How would you fix it with minimal code changes?
Answer hint
A correct answer identifies that the forward pass will fail with a device mismatch error because the model weights are on CUDA but inputs are on CPU. The fix is <code>model(batch.to(device))</code> or moving the batch to device before any operation.
torch.device() and .to() methods are stable. No breaking changes in device handling between these versions.