How to build a neural network in PyTorch
Quick answer
Use torch.nn.Module to define a neural network class in PyTorch. Build the layers in __init__ and implement the forward pass in forward(). Instantiate the model, define a loss function and an optimizer, then train on input data.
Prerequisites
Python 3.8+
pip install torch torchvision
Setup
Install PyTorch using pip and import necessary modules.
pip install torch torchvision
Step by step
Define a simple feedforward neural network, train it on dummy data, and print the loss.
import torch
import torch.nn as nn
import torch.optim as optim
# Define the neural network class
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(10, 50)  # input layer to hidden layer
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(50, 1)   # hidden layer to output layer

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x
# Instantiate the model
model = SimpleNN()
# Loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)
# Dummy input and target tensors
inputs = torch.randn(5, 10) # batch size 5, input features 10
targets = torch.randn(5, 1) # batch size 5, output features 1
# Training loop (one epoch)
model.train()
optimizer.zero_grad()
outputs = model(inputs)
loss = criterion(outputs, targets)
loss.backward()
optimizer.step()
print(f"Loss after one training step: {loss.item():.4f}")
Output
Loss after one training step: 1.2345
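A single optimizer step rarely moves the loss much; in practice you repeat the forward/backward cycle over many epochs. A minimal sketch extending the step above into a short loop over the same kind of dummy data (the epoch count of 100 and the fixed seed are arbitrary choices for illustration):

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Same architecture as SimpleNN above, written with nn.Sequential for brevity
model = nn.Sequential(nn.Linear(10, 50), nn.ReLU(), nn.Linear(50, 1))
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

torch.manual_seed(0)            # make the dummy data reproducible
inputs = torch.randn(5, 10)     # batch size 5, input features 10
targets = torch.randn(5, 1)     # batch size 5, output features 1

model.train()
for epoch in range(100):        # arbitrary epoch count for illustration
    optimizer.zero_grad()       # clear gradients from the previous step
    outputs = model(inputs)
    loss = criterion(outputs, targets)
    loss.backward()             # compute gradients
    optimizer.step()            # update weights
    if (epoch + 1) % 25 == 0:
        print(f"Epoch {epoch + 1}: loss {loss.item():.4f}")
```

On a fixed five-sample batch the loss should fall steadily, since a network this size can easily overfit the data.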
Common variations
You can deepen the network by adding more layers, swap in other activation functions such as nn.Sigmoid, or switch to a different optimizer such as optim.Adam. For GPU acceleration, move the model and data to a CUDA device.
import torch
import torch.nn as nn
import torch.optim as optim
# Move model and data to GPU if available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)
inputs = inputs.to(device)
targets = targets.to(device)
# Use Adam optimizer
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Example of adding a dropout layer
class DeeperNN(nn.Module):
    def __init__(self):
        super(DeeperNN, self).__init__()
        self.fc1 = nn.Linear(10, 100)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.5)
        self.fc2 = nn.Linear(100, 50)
        self.fc3 = nn.Linear(50, 1)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.fc3(x)
        return x
Troubleshooting
- If you get shape mismatch errors, verify input and output tensor dimensions match layer expectations.
- If training is slow, check if CUDA is enabled and model/data are on the GPU.
- For exploding gradients, try lowering the learning rate or adding gradient clipping.
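The gradient-clipping suggestion above can be sketched with torch.nn.utils.clip_grad_norm_, called between backward() and step(); the tiny stand-in model here is just for illustration:

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 1)              # stand-in model for illustration
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(5, 10)
targets = torch.randn(5, 1)

optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
# Rescale gradients in place so their total norm is at most 1.0;
# the function returns the total norm measured before clipping
total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
print(f"Gradient norm before clipping: {total_norm:.4f}")
```

Clipping caps how far any single step can move the weights, which keeps a few bad batches from blowing up training.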
Key Takeaways
- Define neural networks by subclassing torch.nn.Module and implementing forward().
- Use built-in layers like nn.Linear and activation functions such as nn.ReLU.
- Train models by defining a loss function and optimizer, then running forward and backward passes.
- Leverage GPU acceleration by moving model and tensors to CUDA devices.
- Adjust architecture and hyperparameters to improve performance and fix common errors.
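Once trained, switch the model to evaluation mode and disable gradient tracking before making predictions. A minimal sketch, using a fresh model of the same shape as SimpleNN to stand in for a trained one:

```python
import torch
import torch.nn as nn

# Stand-in for a trained SimpleNN (same 10 -> 50 -> 1 architecture)
model = nn.Sequential(nn.Linear(10, 50), nn.ReLU(), nn.Linear(50, 1))

model.eval()                        # disable training-only behavior (e.g. dropout)
with torch.no_grad():               # skip gradient bookkeeping during inference
    sample = torch.randn(1, 10)     # one sample with 10 features
    prediction = model(sample)
print(prediction.shape)             # torch.Size([1, 1])
```

model.eval() matters for layers like nn.Dropout, which behave differently at inference time, while torch.no_grad() saves memory and compute by not building the autograd graph.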