I Fought the Low
Decreasing Stability Gap with Neuronal Decay
K. Zhankov (TU Delft - Electrical Engineering, Mathematics and Computer Science)
G.M. van de Ven – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
Tom Julian Viering – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
A. Hanjalic – Graduation committee member (TU Delft - Intelligent Systems)
Abstract
Task-based continual learning setups suffer from temporary dips in performance shortly after switching to new tasks, a phenomenon referred to as the stability gap. State-of-the-art methods that considerably mitigate catastrophic forgetting do not necessarily reduce the stability gap. One notable continual learning regularization approach, neuronal decay, encourages solutions with small activations in the hidden layers. It has previously been shown to reduce catastrophic forgetting, but it had not been assessed in the context of the stability gap. In this study, we compare neuronal decay against a baseline model to determine whether it can reduce the stability gap. Qualitative analysis with plots and quantitative analysis with metrics such as gap depth, time to recover, and average accuracy both provide strong evidence that this simple regularization method reduces the stability gap without a substantial sacrifice in performance or training time.
The source code is available at https://github.com/zkkv/neuronal-decay.
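To illustrate the idea behind neuronal decay, the following Python sketch adds an L2 penalty on hidden-layer activations to the task loss. This is a minimal, hypothetical example, not the implementation from the linked repository; the network architecture, the coefficient name decay_coef, and the data shapes are assumptions made for illustration only.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MLP(nn.Module):
    # Small MLP that also returns its hidden activations,
    # so that a penalty can be placed on their magnitude.
    def __init__(self, in_dim=784, hidden_dim=256, out_dim=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):
        h1 = F.relu(self.fc1(x))
        h2 = F.relu(self.fc2(h1))
        return self.out(h2), [h1, h2]

def loss_with_neuronal_decay(model, x, y, decay_coef=1e-3):
    # Task loss plus a penalty on the squared hidden activations,
    # encouraging solutions with small activations in the hidden layers.
    logits, hidden = model(x)
    task_loss = F.cross_entropy(logits, y)
    activation_penalty = sum(h.pow(2).mean() for h in hidden)
    return task_loss + decay_coef * activation_penalty

# Example usage with random data (shapes are hypothetical).
model = MLP()
x = torch.randn(32, 784)
y = torch.randint(0, 10, (32,))
loss = loss_with_neuronal_decay(model, x, y)
loss.backward()

In this sketch the only difference from a standard training objective is the activation_penalty term, whose strength is controlled by decay_coef; the rest of the training loop would be unchanged.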