RuntimeError
RuntimeError raised in torch.nn.modules.module (Module.load_state_dict)
Stack trace
RuntimeError: Error(s) in loading state_dict for MyModel: Missing key(s) in state_dict: "layer1.weight", "layer1.bias"... Unexpected key(s) in state_dict: "fc.weight", "fc.bias"...
Why it happens
This error occurs when the keys in the loaded state_dict do not exactly match the keys the model expects. load_state_dict defaults to strict=True, so any missing or unexpected key raises RuntimeError. Common triggers are a model definition that changed after the checkpoint was saved, weights loaded from a different model, or a checkpoint that contains extra layers or lacks some.
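A minimal sketch that reproduces the mismatch. The two classes, SavedModel and CurrentModel, are hypothetical stand-ins (the same layer under two different attribute names), chosen only so the keys match the ones in the stack trace above.

```python
import torch.nn as nn

class SavedModel(nn.Module):
    # Hypothetical architecture that produced the checkpoint
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

class CurrentModel(nn.Module):
    # Hypothetical current architecture: same layer, renamed attribute
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(4, 2)

checkpoint = SavedModel().state_dict()  # keys: fc.weight, fc.bias
model = CurrentModel()                  # expects: layer1.weight, layer1.bias

try:
    model.load_state_dict(checkpoint)   # strict=True by default
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)
```

Renaming an attribute is enough to trigger the error, because state_dict keys are built from attribute names, not from layer types or shapes.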
Detection
Compare the keys of model.state_dict() with the keys of the checkpoint before calling load_state_dict. Printing or asserting on the set difference in both directions catches mismatches early.
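One way to run this check is sketched below. The helper name diff_state_dict_keys is an assumption made for illustration, not a PyTorch API, and the tiny model and checkpoint are stand-ins for your own.

```python
import torch.nn as nn

def diff_state_dict_keys(model, checkpoint):
    """Return (missing, unexpected) key lists without touching the model."""
    model_keys = set(model.state_dict().keys())
    ckpt_keys = set(checkpoint.keys())
    missing = sorted(model_keys - ckpt_keys)     # model expects, checkpoint lacks
    unexpected = sorted(ckpt_keys - model_keys)  # checkpoint has, model lacks
    return missing, unexpected

# Stand-in model with keys layer1.weight and layer1.bias
model = nn.Sequential()
model.add_module("layer1", nn.Linear(4, 2))

# Stand-in checkpoint whose keys carry a different prefix (fc.)
checkpoint = {f"fc.{k}": v for k, v in nn.Linear(4, 2).state_dict().items()}

missing, unexpected = diff_state_dict_keys(model, checkpoint)
print("Missing:", missing)
print("Unexpected:", unexpected)
```

Both lists being empty means a strict load should not fail on key names; tensor shapes are a separate check.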
Causes & fixes
Model architecture changed after saving the checkpoint, causing key mismatches.
Ensure the model definition matches exactly the one used to save the checkpoint before loading the state_dict.
Loading a checkpoint from a different model or pretrained weights with incompatible keys.
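Verify that the checkpoint corresponds to the model architecture you are instantiating. If only part of it should load, pass strict=False to load_state_dict; this ignores missing and unexpected keys, but a key that matches with a different tensor shape still raises RuntimeError.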
Checkpoint contains extra keys not present in the current model (e.g., from a larger model).
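Use model.load_state_dict(checkpoint, strict=False) to allow missing or unexpected keys without raising an error, and inspect the returned missing_keys and unexpected_keys to confirm that only the expected entries were skipped.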
Partial model loading without handling missing keys explicitly.
Manually filter the checkpoint dictionary to include only keys present in the model before loading.
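The manual filtering mentioned in the last fix could look like the sketch below. filter_checkpoint is a hypothetical helper, and the shape check is an added assumption that goes slightly beyond key matching, since a shape mismatch would still fail under strict=False.

```python
import torch
import torch.nn as nn

def filter_checkpoint(model, checkpoint):
    """Keep only entries whose key and tensor shape match the model."""
    model_state = model.state_dict()
    return {
        key: value
        for key, value in checkpoint.items()
        if key in model_state and value.shape == model_state[key].shape
    }

# Stand-in model: keys layer1.weight, layer1.bias, head.weight, head.bias
model = nn.Sequential()
model.add_module("layer1", nn.Linear(4, 2))
model.add_module("head", nn.Linear(2, 1))

# Stand-in checkpoint with one foreign key and one wrong-shaped tensor
checkpoint = {
    "layer1.weight": torch.ones(2, 4),
    "layer1.bias": torch.zeros(2),
    "fc.weight": torch.ones(3, 2),    # not present in the model
    "head.weight": torch.ones(5, 2),  # present, but wrong shape
}

filtered = filter_checkpoint(model, checkpoint)
result = model.load_state_dict(filtered, strict=False)
print("Loaded:", sorted(filtered))
print("Still missing:", result.missing_keys)
```

Layers reported in missing_keys keep their freshly initialized weights, so they usually need training or a separate source of weights.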
Code: broken vs fixed
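# Broken: checkpoint keys do not match the model's keys
import torch
model = MyModel()  # MyModel is your own nn.Module subclass
checkpoint = torch.load('model.pth')
model.load_state_dict(checkpoint)  # Raises RuntimeError: missing/unexpected keys
# Fixed: load non-strictly and report what was skipped
import os
import torch
os.environ['TORCH_HOME'] = '/tmp/torch_cache'  # Example env var usage
model = MyModel()
checkpoint = torch.load('model.pth')
result = model.load_state_dict(checkpoint, strict=False)  # Allow missing/unexpected keys
print('Missing keys:', result.missing_keys)
print('Unexpected keys:', result.unexpected_keys)
Workaround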
Wrap load_state_dict in try/except RuntimeError, then manually filter checkpoint keys to match model keys before retrying load.
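A rough sketch of that workaround follows. load_with_fallback is a hypothetical name, and the small model and checkpoint exist only to exercise the fallback path.

```python
import torch
import torch.nn as nn

def load_with_fallback(model, checkpoint):
    """Try a strict load; on RuntimeError, retry with only matching entries.

    Returns (missing_keys, skipped_checkpoint_keys).
    """
    try:
        model.load_state_dict(checkpoint)
        return [], []
    except RuntimeError as exc:
        print(f"Strict load failed, retrying with filtered keys:\n{exc}")

    model_state = model.state_dict()
    matching = {
        k: v for k, v in checkpoint.items()
        if k in model_state and v.shape == model_state[k].shape
    }
    result = model.load_state_dict(matching, strict=False)
    skipped = sorted(set(checkpoint) - set(matching))
    return list(result.missing_keys), skipped

# Stand-in model and a checkpoint that only partially matches it
model = nn.Sequential()
model.add_module("layer1", nn.Linear(4, 2))
checkpoint = {"layer1.weight": torch.ones(2, 4), "fc.bias": torch.zeros(2)}

missing, skipped = load_with_fallback(model, checkpoint)
print("Missing after retry:", missing)
print("Skipped checkpoint keys:", skipped)
```

Returning both lists keeps the fallback visible to the caller, so a silent partial load does not go unnoticed.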
Prevention
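Keep the model definition and the checkpoint format in sync, for example by versioning them together. Reserve strict=False for development or transfer learning, and check the returned missing_keys and unexpected_keys when you use it, since any layer absent from the checkpoint keeps its initial weights.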
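One possible way to keep architecture and weights together is to store the constructor arguments next to the state_dict. In the sketch below, build_model, the config keys, and arch_version are all assumptions made for illustration; an in-memory buffer stands in for a file such as model.pth.

```python
import io
import torch
import torch.nn as nn

def build_model(config):
    # Hypothetical factory: rebuilds the model from its saved config
    return nn.Sequential(nn.Linear(config["in_features"], config["out_features"]))

config = {"arch_version": 2, "in_features": 4, "out_features": 2}
model = build_model(config)

# Save the config alongside the weights
buffer = io.BytesIO()  # stands in for a path such as 'model.pth'
torch.save({"config": config, "state_dict": model.state_dict()}, buffer)

# Load: rebuild from the saved config, then do a strict load
buffer.seek(0)
saved = torch.load(buffer)
restored = build_model(saved["config"])
result = restored.load_state_dict(saved["state_dict"])  # strict=True by default
print("Architecture version:", saved["config"]["arch_version"])
```

Because the model is rebuilt from the same config that was saved, the strict load can stay enabled and still serve as a check against accidental architecture drift.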