Maintaining Plasticity for Deep Continual Learning
Activation Function-Adapted Parameter Resetting Approaches
V. Purice (TU Delft - Electrical Engineering, Mathematics and Computer Science)
L.R. Engwegen – Mentor (TU Delft - Sequential Decision Making)
Wendelin Böhmer – Mentor (TU Delft - Sequential Decision Making)
Megha Khosla – Graduation committee member (TU Delft - Multimedia Computing)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Standard deep learning tools, in particular feed-forward artificial neural networks trained with the backpropagation algorithm, fail to adapt to sequential learning scenarios in which the model is continuously presented with new training data. Many algorithms aim to solve this problem, but their performance depends heavily on factors such as the properties of the environment, the non-stationarity of the input/output data, and the intrinsic characteristics of the utilised models. In this paper, we design an activation function-adapted framework for reinitializing neurons in continual learning, which aims to preserve the network's ability to learn from and adjust to new data. We introduce a novel utility measure that estimates the activation value of each neuron. The proposed strategy selectively reinitializes neurons exhibiting the lowest and highest activation values, which are typically detrimental to learning performance, particularly in continual learning contexts. We evaluate the proposed framework across different scenarios using various activation functions and show that simple strategies, when well matched to the model's activation function, can effectively mitigate plasticity loss in simple supervised learning tasks.
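The mechanism described in the abstract can be sketched in code. The sketch below is an illustrative assumption, not the thesis's exact method: it uses mean absolute activation as the per-neuron utility measure and reinitializes a fixed fraction of neurons at each extreme, since the precise utility formula and selection rule are not given in this excerpt. The function name `reset_extreme_neurons` and the `frac` parameter are hypothetical.

```python
import numpy as np

def reset_extreme_neurons(W_in, b_in, W_out, activations, frac=0.1, rng=None):
    """Illustrative sketch of activation-based neuron resetting.

    W_in:  (n_in, n_hidden) incoming weights of one hidden layer
    b_in:  (n_hidden,) biases of that layer
    W_out: (n_hidden, n_out) outgoing weights
    activations: (batch, n_hidden) recorded hidden activations
    frac:  fraction of neurons to reset at each extreme (assumed parameter)
    """
    rng = rng or np.random.default_rng(0)
    # Utility estimate: mean absolute activation per neuron (an assumption).
    util = np.abs(activations).mean(axis=0)
    k = max(1, int(frac * util.size))
    order = np.argsort(util)
    # Select neurons with the lowest and highest utility values.
    targets = np.concatenate([order[:k], order[-k:]])
    n_in = W_in.shape[0]
    for j in targets:
        # Fresh incoming weights, as at initialization.
        W_in[:, j] = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=n_in)
        b_in[j] = 0.0
        # Zero outgoing weights so the reset does not perturb current outputs.
        W_out[j, :] = 0.0
    return targets
```

Zeroing the outgoing weights of reset neurons is a common design choice in reinitialization schemes, as it lets the rest of the network continue functioning while the reset units relearn useful features.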