Extrapolating Learning Curves: When Do Neural Networks Outperform Parametric Models?

Bachelor Thesis (2025)
Author(s)

A. Cazacu (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

T.J. Viering – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

C. Yan – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

S. Mukherjee – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Matthijs T.J. Spaan – Graduation committee member (TU Delft - Sequential Decision Making)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
27-06-2025
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Learning curve extrapolation helps practitioners predict model performance at larger data scales, enabling better planning of data collection and computational resource allocation. This paper investigates when neural networks outperform parametric models for this task. We conduct a comprehensive comparison of LC-PFNs (Learning Curve Prior-Data Fitted Networks) and three established parametric models (POW4, MMF4, WBL4) using LCDB v1.1, a large-scale database of learning curves spanning 265 classification tasks and 24 learners. Surprisingly, we find that parametric models, especially POW4 and MMF4, consistently outperform LC-PFN across all generalization scenarios and most cutoff regions. LC-PFN is nonetheless competitive when extrapolating from early-stage data, ranking second best at the 10%, 30%, and 50% cutoffs, which suggests it can be valuable when only a small fraction of the learning curve is available. LC-PFN struggles most on smooth and flat curves and performs slightly better on irregular patterns such as peaking and dipping curves, though it is still outperformed by all parametric models there. These trends point to a misalignment between LC-PFN's training distribution and the real-world diversity of learning curves. Our findings emphasize the strength of parametric models under realistic conditions and suggest avenues for improving LC-PFNs through architectural flexibility and curve-length variability during training.
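The parametric baselines fit a fixed functional form to the observed prefix of a learning curve and then evaluate it at larger training-set sizes. As a rough illustration, the sketch below fits a 4-parameter power law (a common POW4 parameterization; the exact forms of POW4, MMF4, and WBL4 used in the thesis may differ) to synthetic early-stage observations with SciPy and extrapolates it. All data values and parameter choices here are illustrative assumptions, not results from the thesis.

```python
# A minimal sketch (not code from the thesis): fitting a 4-parameter power law
# (one common POW4 parameterization) to the early portion of a learning curve
# and extrapolating it to larger training-set sizes.
import numpy as np
from scipy.optimize import curve_fit

def pow4(n, a, b, c, d):
    """4-parameter power law: error decays toward the asymptote a as n grows."""
    return a + b * (n + d) ** (-c)

# Illustrative "observed" error rates at small training-set sizes (synthetic data).
n_obs = np.array([16, 32, 64, 128, 256, 512], dtype=float)
err_obs = np.array([0.42, 0.35, 0.30, 0.27, 0.25, 0.24])

# Fit on the observed prefix only (analogous to the 10-50% cutoffs studied in the thesis).
params, _ = curve_fit(pow4, n_obs, err_obs, p0=[0.2, 1.0, 0.5, 1.0], maxfev=10_000)

# Extrapolate to unseen, larger training-set sizes.
n_future = np.array([1024, 2048, 4096], dtype=float)
print("Predicted error at larger sizes:", pow4(n_future, *params))
```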
