Layerwise Perspective into Continual Backpropagation
Replacing the First Layer is All You Need
A. Jučas (TU Delft - Electrical Engineering, Mathematics and Computer Science)
J.W. Böhmer – Mentor (TU Delft - Sequential Decision Making)
L.R. Engwegen – Mentor (TU Delft - Sequential Decision Making)
M. Khosla – Graduation committee member (TU Delft - Multimedia Computing)
Abstract
Continual learning faces a problem known as plasticity loss, in which models gradually lose the ability to adapt to new tasks. We investigate Continual Backpropagation (CBP), a method that tackles plasticity loss by continually resetting a small fraction of low-utility neurons. We find that resetting neurons in deeper layers yields progressively worse performance, and that resets restricted to the first layer achieve performance very close to that of regular CBP. We confirm that this phenomenon holds across different models. We also identify an underlying reason for it: first-layer resets prevent continual growth in weight magnitudes, which is crucial for maintaining plasticity, whereas not resetting the first layer results in strong weight growth. Finally, we report the novel finding that CBP fails for models based on non-ReLU activations.
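To make the mechanism discussed above concrete, the sketch below illustrates what CBP-style resets restricted to the first hidden layer of a small PyTorch MLP could look like. It is an assumption-laden illustration, not the thesis's implementation: the class name FirstLayerCBP, the contribution-style utility estimate, and the replacement_rate and decay parameters are invented for this example, and CBP details such as the maturity threshold are omitted.

import torch
import torch.nn as nn

class FirstLayerCBP:
    """Illustrative sketch: track a utility estimate for first-hidden-layer units
    and occasionally re-initialise the least useful one (fresh incoming weights,
    zeroed outgoing weights), i.e. CBP-style resets applied only to the first layer."""

    def __init__(self, fc_in: nn.Linear, fc_out: nn.Linear,
                 replacement_rate: float = 1e-4, decay: float = 0.99):
        self.fc_in, self.fc_out = fc_in, fc_out
        self.replacement_rate = replacement_rate      # expected resets per unit per step (assumed value)
        self.decay = decay                            # running-average decay for the utility estimate
        self.utility = torch.zeros(fc_in.out_features, device=fc_in.weight.device)
        self.to_replace = 0.0                         # fractional reset budget accumulated over steps

    @torch.no_grad()
    def step(self, activations: torch.Tensor):
        # Contribution-style utility: mean |activation| times outgoing weight magnitude.
        contrib = activations.abs().mean(dim=0) * self.fc_out.weight.abs().sum(dim=0)
        self.utility = self.decay * self.utility + (1 - self.decay) * contrib

        # Accumulate a fractional budget; reset one unit whenever it reaches 1.
        self.to_replace += self.replacement_rate * self.fc_in.out_features
        if self.to_replace >= 1.0:
            self.to_replace -= 1.0
            idx = int(self.utility.argmin())
            nn.init.kaiming_uniform_(self.fc_in.weight[idx:idx + 1])  # fresh incoming weights
            self.fc_in.bias[idx] = 0.0
            self.fc_out.weight[:, idx] = 0.0          # zero outgoing weights so the reset unit starts inert
            self.utility[idx] = self.utility.mean()   # avoid immediately re-selecting the new unit

# Illustrative usage with a two-layer ReLU MLP (hypothetical sizes):
# net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
# cbp = FirstLayerCBP(net[0], net[2])
# After each optimiser step: cbp.step(net[1](net[0](batch)))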