How-to · Beginner · 3 min read

How to use Adam optimizer in PyTorch

Quick answer
Create an Adam optimizer with the torch.optim.Adam class by passing your model's parameters and a learning rate. On each training step, call optimizer.zero_grad() to clear stale gradients, loss.backward() to compute new ones, and optimizer.step() to update the model weights.
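In full, that pattern looks like this (a minimal sketch using a bare nn.Linear as the model and random data):

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 1)  # any nn.Module works here
optimizer = optim.Adam(model.parameters(), lr=1e-3)

x, y = torch.randn(4, 10), torch.randn(4, 1)
loss = nn.functional.mse_loss(model(x), y)

optimizer.zero_grad()  # clear gradients from any previous step
loss.backward()        # populate .grad on each parameter
optimizer.step()       # apply the Adam update
```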

PREREQUISITES

  • Python 3.8+
  • pip install "torch>=2.0" (quote the requirement so the shell does not treat > as a redirect)

Setup

Install PyTorch if you haven't already, using the official command from the PyTorch website for your platform and CUDA version. For example, to install with pip:

bash
pip install torch torchvision

Step by step

This example shows how to define a simple neural network, create an Adam optimizer, and run a training step with it.

python
import torch
import torch.nn as nn
import torch.optim as optim

# Define a simple model
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 1)

    def forward(self, x):
        return self.linear(x)

# Instantiate model
model = SimpleNet()

# Create Adam optimizer with learning rate 0.001
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Dummy input and target
inputs = torch.randn(5, 10)
targets = torch.randn(5, 1)

# Define loss function
criterion = nn.MSELoss()

# Forward pass
outputs = model(inputs)
loss = criterion(outputs, targets)

# Backward pass and optimization step
optimizer.zero_grad()  # Clear previous gradients
loss.backward()        # Compute gradients
optimizer.step()       # Update parameters

print(f"Loss: {loss.item():.4f}")

output
Loss: 1.1234

The exact value varies between runs, since both the inputs and the initial weights are random.
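A single step is rarely enough; in practice you wrap the same three calls in a training loop. A sketch of that loop, self-contained and seeded so it is reproducible (the 200-epoch count is arbitrary):

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
model = nn.Linear(10, 1)
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.MSELoss()
inputs, targets = torch.randn(5, 10), torch.randn(5, 1)

first_loss = None
for epoch in range(200):
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    if first_loss is None:
        first_loss = loss.item()  # record the starting loss

print(f"loss fell from {first_loss:.4f} to {loss.item():.4f}")
```

With a tiny linear model on five fixed samples, the loss should drop steadily under these settings.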

Common variations

You can customize the Adam optimizer with parameters like betas, eps, and weight_decay. For example, to add weight decay (L2 regularization):

python
optimizer = optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-5)
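Spelling out the other knobs mentioned above makes the defaults visible; in this sketch every value is PyTorch's default except weight_decay:

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 1)

optimizer = optim.Adam(
    model.parameters(),
    lr=0.001,
    betas=(0.9, 0.999),  # decay rates for the first and second moment estimates
    eps=1e-8,            # numerical-stability term added to the denominator
    weight_decay=1e-5,   # L2 penalty (non-default here)
)
```

If you want decoupled weight decay instead of an L2 penalty folded into the gradient, PyTorch also provides optim.AdamW with the same signature.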

Troubleshooting

  • If your model parameters are not updating, make sure optimizer.step() runs after loss.backward(), and that optimizer.zero_grad() is called before the backward pass, not between backward() and step(), which would wipe the fresh gradients.
  • If you get a runtime error about parameters, verify you passed model.parameters() to the optimizer.
  • For slow convergence, try tuning the learning rate or adjusting betas in Adam.
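To confirm the optimizer is actually touching your weights, you can snapshot a parameter before a step and compare it afterwards. A small diagnostic sketch (the model and loss here are placeholders):

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 1)
optimizer = optim.Adam(model.parameters(), lr=0.001)

before = model.weight.detach().clone()  # snapshot prior to the update

loss = model(torch.randn(5, 10)).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

changed = not torch.equal(before, model.weight)
print("weights updated:", changed)
```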

Key Takeaways

  • Use torch.optim.Adam with model parameters and learning rate to create the optimizer.
  • Always call optimizer.zero_grad() before loss.backward() to reset gradients.
  • Call optimizer.step() after backward pass to update model weights.
  • Customize Adam with parameters like weight_decay for regularization.
  • Check parameter passing and gradient clearing if training does not progress.
Verified 2026-04