Concept Beginner · 3 min read

What is an epoch in machine learning?

Quick answer
An epoch in machine learning is one complete pass through the entire training dataset during model training: a full cycle in which the model sees every training example once while updating its parameters.

How it works

During training, a machine learning model learns by adjusting its parameters based on the data it sees. An epoch is one full pass through all the training samples. Imagine reading a book cover to cover once; that is one epoch. Multiple epochs allow the model to see the data repeatedly, improving its understanding and performance.

Each epoch consists of multiple batches (smaller subsets of data) processed sequentially. After each batch, the model updates its weights, and after completing all batches, one epoch ends.
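The relationship between epochs, batches, and weight updates is simple arithmetic. A quick sketch with made-up numbers (dataset size, batch size, and epoch count are chosen purely for illustration):

```python
# Hypothetical training configuration for illustration
num_samples = 1000   # total training examples
batch_size = 50      # examples per batch
epochs = 10          # full passes over the dataset

# Batches (and weight updates) in one epoch
batches_per_epoch = num_samples // batch_size   # 1000 / 50 = 20

# Total weight updates over the whole training run
total_updates = batches_per_epoch * epochs      # 20 * 10 = 200

print(batches_per_epoch, total_updates)
```

So with these numbers, one epoch means 20 batches and therefore 20 weight updates, and the full run performs 200 updates in total.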

Concrete example

Here is a PyTorch example showing how to train a model for multiple epochs:

python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

# Dummy dataset
X = torch.randn(100, 10)  # 100 samples, 10 features
Y = torch.randint(0, 2, (100,))  # Binary labels

dataset = TensorDataset(X, Y)
dataloader = DataLoader(dataset, batch_size=20, shuffle=True)

# Simple model
model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

epochs = 3
for epoch in range(1, epochs + 1):
    total_loss = 0
    for batch_X, batch_Y in dataloader:
        optimizer.zero_grad()
        outputs = model(batch_X)
        loss = criterion(outputs, batch_Y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f"Epoch {epoch}, Loss: {total_loss:.4f}")
output
Epoch 1, Loss: 3.4567
Epoch 2, Loss: 2.9871
Epoch 3, Loss: 2.6543

When to use it

Use multiple epochs when training models to allow them to learn patterns from the data more thoroughly. Too few epochs can lead to underfitting, where the model hasn't learned enough. Too many epochs can cause overfitting, where the model memorizes training data and performs poorly on new data.

Adjust the number of epochs based on validation performance and early stopping techniques.
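Early stopping can be sketched in a few lines. This is a minimal illustration, not a library API: the `validate` helper below is a made-up stand-in for a real validation pass, hard-coded so the loss stops improving after a few epochs.

```python
# Minimal early-stopping sketch; validate() is a hypothetical stand-in
# for running the model on a held-out validation set.
def validate(epoch):
    # Pretend validation loss improves until epoch 5, then plateaus
    return max(1.0, 6 - epoch) * 0.1

best_loss = float("inf")
patience = 3          # epochs to wait without improvement
patience_left = patience

for epoch in range(1, 21):
    val_loss = validate(epoch)
    if val_loss < best_loss:
        best_loss = val_loss
        patience_left = patience   # improvement: reset the counter
    else:
        patience_left -= 1         # no improvement this epoch
        if patience_left == 0:
            print(f"Early stopping at epoch {epoch}")
            break
```

Here training halts a few epochs after validation loss stops improving, rather than running all 20 epochs; in practice you would also restore the model weights from the best epoch.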

Key terms

Epoch: One full pass through the entire training dataset during model training.
Batch: A subset of the training data processed before updating model weights.
Iteration: One update step of the model parameters, usually after processing one batch.
Underfitting: When a model is too simple or trained too little to capture data patterns.
Overfitting: When a model learns training data too well, including noise, harming generalization.

Key Takeaways

  • An epoch is one complete pass through the entire training dataset during model training.
  • Training typically involves multiple epochs to improve model accuracy and generalization.
  • Use validation metrics and early stopping to choose the optimal number of epochs.
  • Each epoch consists of multiple batches, with model updates after each batch.
  • Too many epochs can cause overfitting; too few can cause underfitting.
Verified 2026-04