How-to · Beginner · 3 min read

How to build a neural network in PyTorch

Quick answer
Subclass torch.nn.Module to define a neural network class in PyTorch: build the layers in __init__ and implement the forward pass in forward(). Then instantiate the model, define a loss function and an optimizer, and train on your input data.

PREREQUISITES

  • Python 3.8+
  • pip install torch torchvision

Setup

Install PyTorch and torchvision with pip. The required imports appear at the top of the code in the next section.

bash
pip install torch torchvision

Step by step

Define a simple feedforward neural network, train it on dummy data, and print the loss.

python
import torch
import torch.nn as nn
import torch.optim as optim

# Define the neural network class
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(10, 50)  # input layer to hidden layer
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(50, 1)   # hidden layer to output layer

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Instantiate the model
model = SimpleNN()

# Loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Dummy input and target tensors
inputs = torch.randn(5, 10)  # batch size 5, input features 10
targets = torch.randn(5, 1)  # batch size 5, output features 1

# Training loop (one epoch)
model.train()
optimizer.zero_grad()
outputs = model(inputs)
loss = criterion(outputs, targets)
loss.backward()
optimizer.step()

print(f"Loss after one training step: {loss.item():.4f}")
output
Loss after one training step: 1.2345

(The exact value varies from run to run because the weights and the dummy data are randomly initialized.)
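A single step rarely suffices; in practice you repeat the forward/backward/step cycle for many epochs. A minimal sketch that extends the example above into a full loop (same architecture and dummy data, with a seed added so runs are reproducible):

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # make weights and dummy data reproducible

class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 50)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(50, 1)

    def forward(self, x):
        return self.fc2(self.relu(self.fc1(x)))

model = SimpleNN()
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(5, 10)
targets = torch.randn(5, 1)

model.train()
losses = []
for epoch in range(100):
    optimizer.zero_grad()                      # clear old gradients
    loss = criterion(model(inputs), targets)   # forward pass
    loss.backward()                            # backward pass
    optimizer.step()                           # update weights
    losses.append(loss.item())
    if (epoch + 1) % 25 == 0:
        print(f"epoch {epoch + 1}: loss = {loss.item():.4f}")
```

Because the loop trains on one tiny fixed batch, the loss should fall steadily; on real data you would iterate over a DataLoader instead of a single tensor.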

Common variations

You can build deeper networks by adding more layers, swap in different activation functions such as nn.Sigmoid or nn.Tanh, or switch to another optimizer such as Adam. For GPU acceleration, move the model and data to a CUDA device.

python
# Continuing from the training example above:
# move the model and data to the GPU when one is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
inputs = inputs.to(device)
targets = targets.to(device)

# Use Adam optimizer
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Example of adding a dropout layer
class DeeperNN(nn.Module):
    def __init__(self):
        super(DeeperNN, self).__init__()
        self.fc1 = nn.Linear(10, 100)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.5)
        self.fc2 = nn.Linear(100, 50)
        self.fc3 = nn.Linear(50, 1)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.fc3(x)
        return x
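One thing to keep in mind with dropout: it is only active in training mode, so call model.eval() before inference. A quick sketch demonstrating the difference, with the DeeperNN class redefined here so the snippet is self-contained:

```python
import torch
import torch.nn as nn

class DeeperNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 100)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.5)
        self.fc2 = nn.Linear(100, 50)
        self.fc3 = nn.Linear(50, 1)

    def forward(self, x):
        x = self.dropout(self.relu(self.fc1(x)))
        x = self.relu(self.fc2(x))
        return self.fc3(x)

model = DeeperNN()
x = torch.randn(5, 10)

model.train()   # dropout active: repeated passes give different outputs
out1, out2 = model(x), model(x)

model.eval()    # dropout disabled: passes are deterministic
with torch.no_grad():
    out3, out4 = model(x), model(x)

print(torch.equal(out1, out2))  # almost always False in train mode
print(torch.equal(out3, out4))  # True in eval mode
```

torch.no_grad() additionally skips gradient tracking, which is the standard pairing with eval() at inference time.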

Troubleshooting

  • If you get shape mismatch errors, verify input and output tensor dimensions match layer expectations.
  • If training is slow, check if CUDA is enabled and model/data are on the GPU.
  • For exploding gradients, try lowering the learning rate or adding gradient clipping.
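The gradient-clipping suggestion above can be applied with torch.nn.utils.clip_grad_norm_, called between backward() and step(). A minimal sketch using a stand-in linear model (the threshold max_norm=1.0 is an arbitrary illustrative choice):

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 1)           # stand-in model for illustration
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(5, 10)
targets = torch.randn(5, 1)

optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()

# Rescale gradients so their global L2 norm is at most 1.0
total_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()

print(f"gradient norm before clipping: {total_norm.item():.4f}")
```

clip_grad_norm_ returns the gradient norm measured before clipping, which is handy for logging how often clipping actually kicks in.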

Key Takeaways

  • Define neural networks by subclassing torch.nn.Module and implementing forward().
  • Use built-in layers like nn.Linear and activation functions such as nn.ReLU.
  • Train models by defining a loss function and optimizer, then running forward and backward passes.
  • Leverage GPU acceleration by moving model and tensors to CUDA devices.
  • Adjust architecture and hyperparameters to improve performance and fix common errors.
Verified 2026-04