Code Beginner easy · 4 min

Arithmetic: add, sub, mul, div

What you will learn

PyTorch tensors support element-wise arithmetic operations just like NumPy arrays, and autograd tracks these operations so gradients can flow through them during backpropagation.

Why this matters

All neural network computations are built on tensor arithmetic: understanding how operations work at this level is the foundation for building models, and knowing which operations track gradients is critical for training.

Skip if: Your data never needs gradients (for example, image preprocessing or data loading). For one-time transformations like these, NumPy is often the lighter choice. Reach for PyTorch when the computation is part of a differentiable pipeline or a trainable model.

Explanation

Arithmetic operations in PyTorch perform element-wise math on tensors. The four basic operations (addition, subtraction, multiplication, and division) work intuitively and produce a new tensor with the same shape as the inputs, or with the broadcast shape when the input shapes differ but are compatible.

How it works: When you add two tensors, PyTorch first matches their shapes via broadcasting rules (size-1 or missing dimensions are virtually expanded to match the other tensor), then applies the operation element by element. Unlike NumPy, PyTorch records each arithmetic operation in the computational graph whenever at least one input has `requires_grad=True`, so gradients can flow backward through these operations during backpropagation.

When to use: Use PyTorch arithmetic whenever tensors are part of a model or loss computation. Use NumPy for one-off data transformations that don't need differentiation.
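The broadcasting and gradient-tracking behavior described above can be sketched as follows. This is a minimal illustration, not part of the lesson's main example; the names `matrix`, `row`, and `total` are made up for the demonstration.

```python
import torch

# A (3, 4) matrix and a (4,) vector: the shapes align from the right.
matrix = torch.ones(3, 4)
row = torch.tensor([1.0, 2.0, 3.0, 4.0], requires_grad=True)

# The vector is virtually expanded to (3, 4) before the element-wise add.
total = matrix + row
print(total.shape)           # torch.Size([3, 4])
print(total.requires_grad)   # True, because `row` requires gradients

# Gradients flow back through the broadcast add.
# Each element of `row` was used in 3 rows, so its gradient is 3.
total.sum().backward()
print(row.grad)              # tensor([3., 3., 3., 3.])
```

Note that `matrix` itself does not require gradients; the result is tracked only because one of the inputs is.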

Analogy

Tensor arithmetic is like doing the same math operation on every number in a spreadsheet simultaneously. Just as a spreadsheet broadcasts a formula across columns, PyTorch broadcasts smaller tensors to match larger ones before operating.

Code

python

import torch

# Create two simple tensors
a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([10.0, 20.0, 30.0])

# Addition
result_add = a + b
print(f"Addition: {result_add}")

# Subtraction
result_sub = b - a
print(f"Subtraction: {result_sub}")

# Multiplication (element-wise)
result_mul = a * b
print(f"Multiplication: {result_mul}")

# Division (element-wise)
result_div = b / a
print(f"Division: {result_div}")

# Broadcasting example: scalar and tensor
scalar = torch.tensor(2.0)
result_broadcast = a * scalar
print(f"Broadcasting (scalar * tensor): {result_broadcast}")

# Verify gradients are tracked
a_with_grad = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
result = a_with_grad + torch.tensor([5.0, 5.0, 5.0])
print(f"\nGradient tracking enabled: {result.requires_grad}")

Output

Addition: tensor([11., 22., 33.])
Subtraction: tensor([ 9., 18., 27.])
Multiplication: tensor([10., 40., 90.])
Division: tensor([10., 10., 10.])
Broadcasting (scalar * tensor): tensor([2., 4., 6.])

Gradient tracking enabled: True

What just happened?

The code created two tensors and performed four arithmetic operations on them element-by-element, producing new tensors with matching shapes. Then it demonstrated broadcasting: a scalar tensor was multiplied with a 3-element tensor, and the scalar was automatically expanded to match the shape. Finally, it showed that arithmetic operations on tensors with `requires_grad=True` preserve the gradient tracking flag, meaning backpropagation can flow through these operations.

Common gotcha

Many developers are unsure whether tensor arithmetic copies data. Out-of-place operations such as `a + b` allocate a new tensor for the result on the same device (CPU or GPU) as the inputs and leave the inputs unchanged. More critically, a common mistake is forgetting that `a / b` performs true (floating-point) division, not integer division. If both tensors are integers and you expect integer division, use `torch.div(a, b, rounding_mode='floor')`. Without a rounding mode, you'll get a float result even if the inputs are int64.
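A short sketch of the division gotcha, assuming two small int64 tensors invented for illustration. It compares `/` with the two rounding modes of `torch.div`.

```python
import torch

a = torch.tensor([7, 8, 9])   # int64 by default
b = torch.tensor([2, 2, 2])

# `/` is true division: the result is a float tensor even for int inputs.
true_div = a / b
print(true_div.dtype)    # torch.float32

# rounding_mode="floor" keeps the integer dtype and rounds toward -infinity.
floor_div = torch.div(a, b, rounding_mode="floor")
print(floor_div)         # tensor([3, 4, 4])

# For negative values, "floor" and "trunc" give different answers.
neg_floor = torch.div(-a, b, rounding_mode="floor")   # rounds down
neg_trunc = torch.div(-a, b, rounding_mode="trunc")   # rounds toward zero
print(neg_floor)         # tensor([-4, -4, -5])
print(neg_trunc)         # tensor([-3, -4, -4])
```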

Error recovery

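RuntimeError: The size of tensor a (4) must match the size of tensor b (5) at non-singleton dimension 1

The tensors have incompatible shapes for broadcasting. Shapes are compared from the rightmost dimension backward, and each pair of sizes must be equal, or one of them must be 1 (or missing). Example: shapes (3, 4) and (4,) work; (3, 4) and (5,) don't. Reshape or expand one tensor so the dimensions line up.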
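The sketch below reproduces a broadcasting failure and shows one possible fix. The tensors are made up for illustration, and `unsqueeze` is just one of several ways to add the missing dimension.

```python
import torch

x = torch.zeros(3, 4)

# (3, 4) + (4,) works: the trailing dimensions match.
ok = x + torch.ones(4)

# (3, 4) + (5,) fails: trailing dimensions 4 and 5 are incompatible.
try:
    x + torch.ones(5)
    mismatch_message = None
except RuntimeError as err:
    mismatch_message = str(err)
print(mismatch_message)

# A per-row vector of shape (3,) would also fail against (3, 4).
# Giving it an explicit trailing dimension of 1 makes it broadcastable:
per_row = torch.tensor([1.0, 2.0, 3.0])
fixed = x + per_row.unsqueeze(1)   # (3, 4) + (3, 1) -> (3, 4)
print(fixed.shape)                 # torch.Size([3, 4])
```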

TypeError: unsupported operand type(s)

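One operand is not a tensor or a Python number (e.g., a tensor plus a Python list, or a tensor plus `None` returned by a function that is missing its return statement). Convert the operand first with `torch.tensor(...)`. Note that dividing an int64 tensor by an int64 tensor is not an error; it returns a float tensor.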
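As a hedged illustration of an unsupported operand, the sketch below adds a plain Python list to a tensor. The variable names are hypothetical, and the exact error text may vary between PyTorch versions, so only the exception type is relied on here.

```python
import torch

t = torch.tensor([1.0, 2.0, 3.0])
values = [10.0, 20.0, 30.0]   # a plain Python list, not a tensor

try:
    t + values                # Tensor + list is not a supported combination
    raised = False
except TypeError as err:
    raised = True
    print(f"TypeError: {err}")

# Converting the list to a tensor first makes the operation valid.
fixed = t + torch.tensor(values)
print(fixed)                  # tensor([11., 22., 33.])

# int64 / int64 is allowed and yields a float tensor.
quotient = torch.tensor([1, 2, 3]) / torch.tensor([2, 2, 2])
print(quotient.dtype)         # torch.float32
```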

RuntimeError: CUDA out of memory

Arithmetic operations on very large GPU tensors can exceed memory. Use smaller batch sizes, use mixed precision with torch.amp.autocast('cuda'), or move to CPU if the operation is not part of the training loop.

Experienced dev note

In-place operations (like `a += b` or `a.add_(b)`) save memory but can conflict with autograd: on a leaf tensor with `requires_grad=True` they raise a RuntimeError, and on an intermediate tensor they can overwrite values the backward pass still needs. Avoid in-place ops on tensors you'll backprop through. Also, element-wise `*` is not matrix multiplication: use `@` or `torch.matmul()` for that. A subtle point: adding a Python scalar (int/float) to a tensor works, and gradients still flow to the tensor, but the scalar is treated as a constant and never receives a gradient. If a value should be learned, make it a tensor with `requires_grad=True`.
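A minimal sketch of the three points in this note, using small made-up tensors: an in-place update on a leaf tensor, arithmetic with a Python scalar, and element-wise versus matrix multiplication.

```python
import torch

w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# In-place update on a leaf tensor that requires grad raises an error.
try:
    w += 1.0
    inplace_error = None
except RuntimeError as err:
    inplace_error = str(err)
print(inplace_error)

# Python scalars act as constants; gradients still reach the tensor.
loss = (w * 2.0 + 1.0).sum()
loss.backward()
print(w.grad)                 # tensor([2., 2., 2.])

# Element-wise product vs. matrix product on the same 2x2 matrix.
m = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
elementwise = m * m           # [[1, 4], [9, 16]]
matrix_product = m @ m        # [[7, 10], [15, 22]]
print(elementwise)
print(matrix_product)
```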

Check your understanding

You have two tensors: `x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)` and `y = torch.tensor([[10.0], [20.0], [30.0]])`. What happens when you compute `z = x + y`, and why? (Hint: what shapes do they have, and what shape is the result?)

Show answer hint

A correct answer explains broadcasting: x is shape (3,), y is shape (3, 1). During addition, x is treated as shape (1, 3), both tensors are expanded to (3, 3), and the result is shape (3, 3). The answer must also note that z.requires_grad will be True because at least one input has requires_grad=True.
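If you want to check your answer, the snippet below runs the exact tensors from the question and prints the resulting shape and values.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)   # shape (3,)
y = torch.tensor([[10.0], [20.0], [30.0]])              # shape (3, 1)

# x acts like a (1, 3) row, y is a (3, 1) column; both expand to (3, 3).
z = x + y
print(z.shape)           # torch.Size([3, 3])
print(z.requires_grad)   # True
print(z)
# Row i is y[i] added to every element of x:
# [[11, 12, 13], [21, 22, 23], [31, 32, 33]]
```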

VERSION No breaking changes for basic arithmetic between PyTorch 2.6.x and 2.11.x. However, torch.divide() was introduced in 1.8.0; older versions require the `/` operator. The examples here use the `/` operator and `torch.div()`, and are forward-compatible.

Next, learn how to reshape and index tensors: essential for building neural network inputs and manipulating intermediate outputs.
