Code Beginner easy · 4 min

What a tensor is: N-dimensional array

What you will learn

A tensor is PyTorch's fundamental data structure: a generalized N-dimensional array that can live on CPU or GPU and tracks computation for automatic differentiation.

Why this matters

Everything in PyTorch is a tensor: model weights, inputs, outputs, gradients. Understanding how tensors work at the mechanical level is prerequisite to building any neural network or training loop.

Skip if: you only need NumPy-style array operations for data science. Use NumPy directly, and move to tensors only when you need GPU acceleration or gradient computation.

Explanation

A tensor is PyTorch's core data structure: think of it as a generalization of a NumPy array that can live on either CPU or GPU and can take part in automatic gradient computation. A 0D tensor is a scalar (single number), a 1D tensor is a vector, a 2D tensor is a matrix, and 3D+ tensors are higher-dimensional arrays. Mechanically, when you create a tensor with torch.tensor(), PyTorch allocates memory, copies the data into it, and wraps it with metadata (shape, dtype, device, requires_grad flag). Unlike NumPy arrays, tensors can record the mathematical operations performed on them so that gradients can flow backward through a computation graph during training. You use tensors whenever you need GPU computation, automatic differentiation, or both, which covers essentially every neural network training loop.
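To make the gradient-tracking idea concrete, here is a minimal sketch (the variable names x and y are illustrative, not part of the lesson's main example). It inspects the metadata attached to a tensor and then lets autograd compute a derivative.

```python
import torch

# A leaf tensor that autograd should track.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Metadata stored alongside the data.
print(x.shape, x.dtype, x.device, x.requires_grad)

# Each operation on x is recorded in the computation graph.
y = (x ** 2).sum()  # y = x0^2 + x1^2 + x2^2

# Walk the graph backward; analytically dy/dx = 2 * x.
y.backward()
print(x.grad)  # tensor([2., 4., 6.])
```

The gradient lands in x.grad rather than being returned, which is why training loops read parameter gradients from the parameters themselves.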

Analogy

If NumPy arrays are like raw plywood, tensors are like smart lumber that remembers every tool used to shape it, so you can rewind and improve your craftsmanship.

Code

python

import torch

scalar = torch.tensor(5.0)
print(f"Scalar shape: {scalar.shape}, value: {scalar}")

vector = torch.tensor([1.0, 2.0, 3.0])
print(f"Vector shape: {vector.shape}")
print(f"Vector: {vector}")

matrix = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])
print(f"Matrix shape: {matrix.shape}")
print(f"Matrix:\n{matrix}")

tensor_3d = torch.randn(2, 3, 4)
print(f"3D tensor shape: {tensor_3d.shape}")

tensor_gpu = torch.tensor([1.0, 2.0, 3.0])
if torch.cuda.is_available():
    tensor_gpu = tensor_gpu.to('cuda')
    print(f"Tensor on device: {tensor_gpu.device}")
else:
    print(f"Tensor on device: {tensor_gpu.device}")

grad_tensor = torch.tensor([2.0, 3.0, 4.0], requires_grad=True)
print(f"\nGrad tracking enabled: {grad_tensor.requires_grad}")
print(f"Dtype: {grad_tensor.dtype}")

Output

Scalar shape: torch.Size([]), value: tensor(5.)
Vector shape: torch.Size([3])
Vector: tensor([1., 2., 3.])
Matrix shape: torch.Size([2, 3])
Matrix:
tensor([[1., 2., 3.],
        [4., 5., 6.]])
3D tensor shape: torch.Size([2, 3, 4])
Tensor on device: cpu

Grad tracking enabled: True
Dtype: torch.float32

What just happened?

The code created tensors of increasing dimensionality (scalar, vector, 2D matrix, 3D array), printed their shapes and values, checked GPU availability (not present in this environment, so the tensor stayed on CPU), and demonstrated the requires_grad=True flag that tells PyTorch to track operations on that tensor for backpropagation. The shape printouts show each tensor's dimensions, and the final line shows the default dtype for floating-point data, torch.float32.

Common gotcha

Tensor shape and tensor size are often treated as different things, but t.shape and t.size() return the same torch.Size. More critically, torch.tensor() infers dtype from the Python values: a list of floats becomes torch.float32, while a list of integers becomes torch.int64. Integer tensors keep integer results for operations such as floor division, cannot be created with requires_grad=True, and are rejected by some floating-point operations (torch.mean, for example). Be explicit about dtype when it matters, e.g., torch.tensor([1, 2, 3], dtype=torch.float32).
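A short sketch of how dtype inference plays out in practice; the variable names are illustrative only.

```python
import torch

ints = torch.tensor([1, 2, 3])          # integers are inferred as int64
floats = torch.tensor([1.0, 2.0, 3.0])  # floats are inferred as float32
print(ints.dtype, floats.dtype)

# shape and size() report the same thing.
print(ints.shape == ints.size())  # True

# Floor division stays integer; true division promotes to float.
halves_floor = ints // 2
halves_true = ints / 2
print(halves_floor.dtype, halves_true.dtype)

# Integer tensors cannot track gradients.
int_grad_rejected = False
try:
    torch.tensor([1, 2, 3], requires_grad=True)
except RuntimeError as err:
    int_grad_rejected = True
    print(f"Rejected: {err}")

# Being explicit avoids the surprise.
explicit = torch.tensor([1, 2, 3], dtype=torch.float32)
print(explicit.dtype)
```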

Error recovery

RuntimeError: Expected a scalar tensor

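You passed a tensor where PyTorch or Python expected a single value; the exact wording varies by operation and version. Solution: use .item() to extract the Python scalar from a one-element tensor, e.g., loss_value = loss.item(). If the tensor has more than one element, reduce it first (for example with .mean() or .sum()) or convert it with .tolist().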
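As a rough illustration (the loss here is a stand-in computed from made-up numbers, not a real training loss), .item() works on one-element tensors and fails on anything larger:

```python
import torch

# A 0-D tensor, as a reduced loss typically is.
loss = torch.tensor([0.5, 1.5]).mean()
loss_value = loss.item()  # plain Python float
print(type(loss_value), loss_value)

# .item() is only defined for one-element tensors.
batch = torch.tensor([0.5, 1.5])
item_failed = False
try:
    batch.item()
except (RuntimeError, ValueError) as err:  # exception type differs across versions
    item_failed = True
    print(f"item() failed: {err}")

# For several values, convert to a Python list instead.
values = batch.tolist()
print(values)
```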

TypeError: can't convert cuda:0 device type tensor to numpy

You're trying to call .numpy() on a GPU tensor. NumPy arrays live in host memory and cannot point at GPU memory. Solution: move to CPU first: tensor.cpu().numpy(). If the tensor also requires gradients, detach it as well: tensor.detach().cpu().numpy().
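A device-agnostic sketch of the conversion, assuming NumPy is installed alongside PyTorch. It runs on CPU-only machines too, where .cpu() is simply a no-op.

```python
import torch

# Use the GPU when one is present, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
t = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)

# detach() drops the autograd history, cpu() copies to host memory if needed,
# and numpy() then wraps the host data as an ndarray.
arr = t.detach().cpu().numpy()
print(type(arr), arr.dtype, arr)
```

Chaining all three calls is a common defensive habit because it works regardless of where the tensor lives or whether it tracks gradients.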

RuntimeError: leaf variable has been moved into the graph interior

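You modified a tensor that requires gradients in place (e.g., tensor[0] = 5.0). Recent PyTorch releases usually report this as "a leaf Variable that requires grad is being used in an in-place operation" (prefixed with "a view of" for indexed writes). Solution: don't write in place to a leaf tensor with requires_grad=True. Either work on a copy (tensor = tensor.clone(); tensor[0] = 5.0) or, if the write should not be tracked, do it inside a with torch.no_grad(): block.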
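The sketch below reproduces the failure and the clone-based workaround. The names w and w2 are illustrative, and the exact error text depends on the PyTorch version.

```python
import torch

w = torch.tensor([2.0, 3.0, 4.0], requires_grad=True)

# Writing into a leaf tensor that requires grad is rejected.
rejected = False
try:
    w[0] = 5.0
except RuntimeError as err:
    rejected = True
    print(f"In-place write rejected: {err}")

# clone() yields a non-leaf copy that may be modified in place.
w2 = w.clone()
w2[0] = 5.0

# Gradients still flow back to w, except through the overwritten element.
w2.sum().backward()
print(w.grad)  # tensor([0., 1., 1.])
```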

Experienced dev note

A tensor is not just data: it can be a node in a computation graph. Once a tensor has requires_grad=True, every operation that touches it is recorded in a directed acyclic graph of backward functions, and the results keep references to the intermediate values needed for backpropagation. This is powerful for training but adds silent memory overhead if you're not careful. Wrap inference code in with torch.no_grad(): and call .detach() explicitly when you want to break the graph. Many memory and performance bugs come from accidentally keeping graph history alive on tensors you don't need to train.
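A small sketch of the two ways to stop graph tracking mentioned above; the tensor names are illustrative.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

# Normal operation: the result carries graph history.
tracked = w * 3
print(tracked.requires_grad, tracked.grad_fn is not None)  # True True

# Inside no_grad, nothing is recorded.
with torch.no_grad():
    untracked = w * 3
print(untracked.requires_grad)  # False

# detach() returns a tensor cut off from the existing graph.
detached = tracked.detach()
print(detached.requires_grad, detached.grad_fn)  # False None
```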

Check your understanding

If you create a tensor with shape (3, 4) and perform an in-place operation like t += 1 on it, and requires_grad=True is set, what happens and why would you get an error?

Answer hint

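In-place operations modify a tensor's values without creating a new tensor. PyTorch forbids tracked in-place ops on leaf tensors that require gradients, so t += 1 raises a RuntimeError: the overwritten tensor would stop being a leaf of the graph, and autograd would have nowhere to accumulate its gradient. Use the out-of-place form (t = t + 1), or, when the update should not be tracked (as in an optimizer step), wrap it in with torch.no_grad():. Note that .detach() returns a tensor sharing the same storage, so an in-place write through it still changes the original values.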
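The scenario from the question can be sketched as follows. The untracked update mirrors what a manual parameter update might look like; it is an illustration, not a full optimizer.

```python
import torch

t = torch.ones(3, 4, requires_grad=True)

# Tracked in-place op on a leaf: rejected.
raised = False
try:
    t += 1
except RuntimeError as err:
    raised = True
    print(f"Error: {err}")

# Out-of-place: creates a new tensor and keeps t intact.
u = t + 1

# Untracked in-place update: allowed, and t remains a leaf.
with torch.no_grad():
    t += 1

print(t.is_leaf, t.requires_grad)
```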

VERSION PyTorch 0.3.x and earlier required wrapping tensors in torch.autograd.Variable for gradient tracking. Since 0.4.0, Tensor and Variable are merged: no wrapping is needed, and any floating-point tensor tracks gradients when requires_grad=True is set. The old pattern Variable(tensor, requires_grad=True) is deprecated; it still runs, but it simply returns a tensor.

Once you understand tensors, the next natural step is learning how to manipulate them: reshaping, indexing, slicing, and basic arithmetic operations with shape broadcasting.
