RuntimeError
torch._C._RuntimeError
Stack trace
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
Why it happens
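PyTorch does not copy tensors between devices implicitly, so an operation requires all of its input tensors to reside on the same device. If one tensor is on the CPU and another is on a GPU, operations such as addition or concatenation fail with this error. This usually happens when tensors are created or moved inconsistently, for example when inputs are created on the CPU (the default) while the model's parameters have already been moved to the GPU.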
Detection
Add assertions or logging to check tensor.device attributes before operations to catch device mismatches early.
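A minimal sketch of such a check is shown here. The helper name assert_same_device is hypothetical and not part of PyTorch; it only compares the .device attribute of each tensor it is given and reports every device it saw.

```python
import torch

def assert_same_device(**tensors):
    """Raise a ValueError listing each tensor's device if they differ.

    Hypothetical helper for illustration; not a PyTorch API.
    """
    devices = {name: t.device for name, t in tensors.items()}
    if len(set(devices.values())) > 1:
        raise ValueError(f"Device mismatch: {devices}")

x = torch.zeros(3)
y = torch.ones(3)
assert_same_device(x=x, y=y)  # passes: both tensors are on the CPU
```

Calling the helper just before a suspicious operation turns a late RuntimeError into an early message that names the offending tensor.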
Causes & fixes
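Cause: Input tensors are created on the CPU by default, but the model or other tensors are on the GPU.
Fix: Explicitly move all tensors to the same device with tensor.to(device) or tensor.cuda() before the operation. Both calls return a new tensor, so assign the result back.
Cause: Mixing tensors from different GPUs, or mixing CPU and GPU tensors, in the same operation.
Fix: Ensure all tensors are moved to exactly the same device, e.g., 'cuda:0', before performing the operation.
Cause: Model weights are loaded on the GPU but the input data remains on the CPU.
Fix: Move input tensors to the model's device. A plain torch.nn.Module has no .device attribute, so read the device from a parameter: input_tensor = input_tensor.to(next(model.parameters()).device). Some wrapper libraries expose model.device directly.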
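The last fix can be sketched as follows for a plain torch.nn.Module, which does not expose a .device attribute. The nn.Linear layer is a toy stand-in for a real model, and the variable names are illustrative assumptions.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(4, 2).to(device)  # toy model standing in for a real one

# Read the device from one of the model's parameters.
model_device = next(model.parameters()).device

batch = torch.randn(8, 4)        # created on the CPU by default
batch = batch.to(model_device)   # .to() returns a new tensor, so reassign it
output = model(batch)            # input and weights now share a device
```

This assumes all parameters of the model live on one device; a model sharded across several GPUs needs per-module handling instead.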
Code: broken vs fixed
# Broken: x stays on the CPU while y is moved to the GPU
import torch
x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6]).cuda()
z = x + y  # RuntimeError: Expected all tensors to be on the same device

# Fixed: both tensors are moved to one shared device
import torch
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
x = torch.tensor([1, 2, 3]).to(device)  # Moved to the shared device
y = torch.tensor([4, 5, 6]).to(device)
z = x + y  # Works without error
print(z)
Workaround
Wrap the failing tensor operation in try/except RuntimeError and, if a device mismatch occurs, move all tensors to the CPU as a fallback before retrying.
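One way this fallback might look is sketched here. The function name run_with_cpu_fallback is hypothetical, and recognising the error by the phrase "same device" in its message is an assumption based on the message shown in the stack trace above; the wording could differ between PyTorch versions.

```python
import torch

def run_with_cpu_fallback(op, *tensors):
    """Call op(*tensors); on a device-mismatch error, retry on the CPU.

    Hypothetical helper for illustration; not a PyTorch API.
    """
    try:
        return op(*tensors)
    except RuntimeError as err:
        # Assumption: the mismatch is identified by its message text.
        if "same device" not in str(err):
            raise  # unrelated RuntimeError, do not hide it
        return op(*(t.cpu() for t in tensors))

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])
if torch.cuda.is_available():
    b = b.cuda()  # deliberately create a mismatch when a GPU is present
result = run_with_cpu_fallback(torch.add, a, b)
print(result)
```

Because the retry runs on the CPU, this is best treated as a stopgap while the inconsistent device placement is tracked down.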
Prevention
Standardize device assignment in your codebase by defining a global device variable and always moving tensors and models to this device immediately after creation or loading.
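A minimal sketch of this convention, assuming a single module-level constant named DEVICE (the name and the toy nn.Linear model are illustrative choices, not a required pattern):

```python
import torch
import torch.nn as nn

# One shared device, chosen once at startup.
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Move the model right after creating or loading it.
model = nn.Linear(3, 1).to(DEVICE)

# Create tensors directly on the shared device instead of moving them later.
x = torch.tensor([[1.0, 2.0, 3.0]], device=DEVICE)

prediction = model(x)  # model and input already agree on the device
```

Passing device=DEVICE at creation time avoids an extra copy compared with creating the tensor on the CPU and calling .to(DEVICE) afterwards.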