Reaching for Resilience: Understanding How Optimizers Affect the Stability Gap in Continual Learning

Bachelor Thesis (2025)
Author(s)

C. Obis (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

G.M. van de Ven – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Tom Julian Viering – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

A. Hanjalic – Graduation committee member (TU Delft - Intelligent Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
25-06-2025
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In the context of continual learning, recent work has identified a significant and recurring performance drop, followed by a gradual recovery, upon the introduction of a new task. This phenomenon is referred to as the stability gap. Investigating it and the potential solutions is essential, as such findings can reduce both the energy consumption and computational time required to prepare a high-performing agent. Given the strong influence of training procedures on model performance and stability, we analyze how various optimizers (SGD, NAG, AdaGrad, RMSprop, Adam) and momentum values affect the stability gap. We expose a deep neural network to a sequence of digit-identification tasks with varying rotations, and track several metrics to capture the components of the stability gap and the overall performance. Our results reveal that increasing momentum amplifies the steepness and depth of the gap, while shortening its duration. Within this simplified setup, RMSprop proves most effective in reducing the magnitude and duration of the drop while maintaining high overall performance.
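
The abstract names depth, steepness, and duration as components of the stability gap but does not define them here. As a rough illustration only, the sketch below shows one way such quantities could be read off a per-step accuracy trace on a previously learned task. The function name `stability_gap_stats`, its arguments, and the definitions used (depth as the drop from the pre-switch accuracy to the minimum, duration as the number of steps until accuracy first returns to the pre-switch level) are assumptions made for this example, not the metrics used in the thesis.

```python
def stability_gap_stats(acc, switch_step):
    """Summarize the dip in old-task accuracy after a new task is introduced.

    acc         -- list of accuracies on the old task, one entry per training step
    switch_step -- index of the first step trained on the new task (must be >= 1)

    All definitions below are illustrative assumptions, not the thesis's metrics.
    """
    baseline = acc[switch_step - 1]  # accuracy just before the new task starts
    after = acc[switch_step:]        # trace from the first new-task step onward

    # Position of the lowest accuracy after the switch.
    min_offset = min(range(len(after)), key=after.__getitem__)
    depth = baseline - after[min_offset]

    # Steps until accuracy first climbs back to the baseline after the minimum;
    # None if it never recovers within the trace.
    duration = next(
        (i + 1 for i, a in enumerate(after) if i > min_offset and a >= baseline),
        None,
    )
    return {
        "depth": depth,
        "steps_to_minimum": min_offset + 1,
        "duration": duration,
    }
```

Under these assumed definitions, a steeper gap would show up as a larger ratio of `depth` to `steps_to_minimum`, and a shorter gap as a smaller `duration`.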

Files

License info not available