Effectiveness of Machine Learning Models in Classifying Learners Based on Learning Curves

Improving Our Understanding of Learning Curves Through the Process of Classification

Bachelor Thesis (2025)

Author(s)

S. Basaran (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

C. Yan – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

S. Mukherjee – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

T.J. Viering – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Matthijs T.J. Spaan – Graduation committee member (TU Delft - Sequential Decision Making)

Faculty

Electrical Engineering, Mathematics and Computer Science

Machine learning, Classification, Time series, Learning curve, Extrapolation, TSC

To reference this document use:

https://resolver.tudelft.nl/uuid:207f6693-f34c-4404-a309-e0bd35334981

More Info

Publication Year

2025

Language

English

Graduation Date

25-06-2025

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In machine learning, a learning curve plots a model's performance against training set size. Learning curves inform decisions about data acquisition, model selection, and hyperparameter tuning. Despite their importance, recent research suggests that our understanding of learning curve behavior remains limited. In this work, we investigate learning curves from a classification perspective to better understand their structural properties. By framing learning curves as time series and applying time series classification (TSC) techniques, we uncover several key findings: (1) training accuracy curves are significantly more distinguishable across models than validation or test curves; (2) learning curves become more informative and discriminative once a sufficient number of anchor points is available; and (3) TSC models that emphasize global structural features outperform those focused on local or pointwise characteristics. These results offer new insights into the nature of learning curves and suggest promising directions for future work, including the development of specialized models that move beyond conventional time series assumptions.
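The framing described in the abstract, where each learning curve is treated as a short time series whose class label is the learner that produced it, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the anchor sizes, the power-law curve shapes, the two learner names, and the 1-nearest-neighbour Euclidean classifier standing in for a TSC model. It does not reproduce the thesis's data, models, or results.

```python
import math
import random

# Anchor points: training set sizes at which accuracy is measured (assumed values).
ANCHORS = [16, 32, 64, 128, 256, 512, 1024, 2048]


def synthetic_curve(limit, scale, rate, rng, noise=0.01):
    """Toy power-law curve: accuracy(n) = limit - scale * n^(-rate) + Gaussian noise."""
    return [limit - scale * n ** (-rate) + rng.gauss(0.0, noise) for n in ANCHORS]


def make_dataset(rng, per_class=30):
    """Build labelled curves for two hypothetical learners with different shapes."""
    params = {"learner_a": (0.95, 1.0, 0.5), "learner_b": (0.90, 0.6, 0.25)}
    data = []
    for label, (limit, scale, rate) in params.items():
        for _ in range(per_class):
            data.append((synthetic_curve(limit, scale, rate, rng), label))
    return data


def euclidean(a, b):
    """Pointwise Euclidean distance between two equal-length curves."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_1nn(train, curve, n_anchors):
    """Return the label of the nearest training curve, using only the first n_anchors points."""
    nearest = min(train, key=lambda item: euclidean(item[0][:n_anchors], curve[:n_anchors]))
    return nearest[1]


def accuracy(train, test, n_anchors):
    """Fraction of test curves whose learner is identified correctly."""
    hits = sum(predict_1nn(train, curve, n_anchors) == label for curve, label in test)
    return hits / len(test)


rng = random.Random(0)  # fixed seed so the toy data is reproducible
train = make_dataset(rng)
test = make_dataset(rng)

# Classify using progressively longer prefixes of each curve.
for k in (2, 4, len(ANCHORS)):
    print(f"first {k} anchors: accuracy = {accuracy(train, test, k):.2f}")
```

Truncating each curve to its first `n_anchors` points is one simple way to probe how classification quality depends on the number of anchors observed. A pointwise distance such as the one used here is the kind of local comparison that the abstract contrasts with models emphasizing global structure.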

Files

FINAL_Research_Paper_5771129.p... (pdf)

(pdf | 2.62 Mb)

License info not available