Effectiveness of Machine Learning Models in Classifying Learners Based on Learning Curves
Improving Our Understanding of Learning Curves Through the Process of Classification
S. Basaran (TU Delft - Electrical Engineering, Mathematics and Computer Science)
C. Yan – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
S. Mukherjee – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
T.J. Viering – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
Matthijs T.J. Spaan – Graduation committee member (TU Delft - Sequential Decision Making)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
In machine learning, a learning curve plots model performance against training set size. Learning curves inform decisions about data acquisition, model selection, and hyperparameter tuning. Despite their importance, recent research suggests that our understanding of learning curve behavior remains limited. In this work, we investigate learning curves from a classification perspective to better understand their structural properties. By framing learning curves as time series and applying time series classification (TSC) techniques, we uncover several key findings: (1) training accuracy curves are significantly more distinguishable across models than validation or test curves; (2) learning curves become more informative and discriminative once a sufficient number of anchor points (the training set sizes at which performance is measured) is available; and (3) TSC models that emphasize global structural features outperform those focused on local or pointwise characteristics. These results offer new insights into the nature of learning curves and suggest promising directions for future work, including the development of specialized models that move beyond conventional time series assumptions.
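To make the framing concrete, the following is a minimal sketch of treating learning curves as time series and classifying them by the learner that produced them. It is not the method used in this work: the anchors, the two learner names ("fast" and "slow"), the power-law parameters, and the noise level are all hypothetical, and the classifier is a plain 1-nearest-neighbour baseline with Euclidean distance, which is a pointwise approach rather than one of the global-structure TSC models the abstract refers to.

```python
import math
import random

# Hypothetical anchors: training set sizes at which accuracy is measured.
ANCHORS = [16, 32, 64, 128, 256, 512, 1024, 2048]

# Hypothetical learners, each described by (asymptote, scale, decay rate)
# of an assumed power-law curve: accuracy(n) = a - b * n ** (-c).
LEARNERS = {"fast": (0.90, 0.8, 0.6), "slow": (0.90, 0.8, 0.3)}


def synthetic_curve(learner, rng):
    """Return one noisy accuracy value per anchor for a made-up learner."""
    a, b, c = LEARNERS[learner]
    return [a - b * n ** (-c) + rng.gauss(0, 0.01) for n in ANCHORS]


def euclidean(u, v):
    """Pointwise distance between two curves sampled at the same anchors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def predict_1nn(train, curve):
    """Return the label of the training curve closest to `curve`."""
    nearest = min(train, key=lambda item: euclidean(item[0], curve))
    return nearest[1]


def make_dataset(n_per_class, rng):
    """Build a list of (curve, label) pairs for every hypothetical learner."""
    return [
        (synthetic_curve(name, rng), name)
        for name in LEARNERS
        for _ in range(n_per_class)
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train = make_dataset(20, rng)
test = make_dataset(20, rng)

correct = sum(predict_1nn(train, curve) == label for curve, label in test)
accuracy = correct / len(test)
print(f"1-NN accuracy on synthetic curves: {accuracy:.2f}")
```

The synthetic curves here are deliberately easy to separate, so the score says nothing about real learners. With measured curves, the same structure applies: each curve is one sequence indexed by anchor, the label is the learner, and truncating the sequences to the first few anchors is one way to probe how distinguishability changes with the number of anchor points.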