“How Much Data is Enough?” Learning curves for machine learning

Investigating alternatives to the Levenberg-Marquardt algorithm for learning curve extrapolation

Bachelor thesis (2023)

Authors

L. Negru Electrical Engineering, Mathematics and Computer Science

Contributors

J.H. Krijthe Pattern Recognition and Bioinformatics - (mentor)

T.J. Viering Pattern Recognition and Bioinformatics - (mentor)

Z. Yue (graduation committee member)

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use:

http://resolver.tudelft.nl/uuid:5786dcac-2dee-453e-93ba-bae7aafe260e

Published Date

28-06-2023

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This research explores fitting algorithms for learning curves. A learning curve describes how the performance of a machine learning model changes with the size of its training set. Fitting a parametric model to a learning curve and extrapolating it can therefore help estimate the data set size required to reach a desired performance.
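As a minimal sketch of that extrapolation step, assume the fitted curve is a three-parameter power law, err(n) = a * n^(-b) + c, with a > 0 and b > 0. This model and the function name `required_training_size` are illustrative assumptions, not taken from the thesis. Under that assumption the curve can be inverted to estimate the training set size needed for a target error:

```python
import math


def required_training_size(a, b, c, target_error):
    """Invert the assumed power law err(n) = a * n**(-b) + c for n.

    a, b, c are hypothetical fitted parameters (a > 0, b > 0);
    c is the asymptotic error that more data cannot remove.
    """
    if a <= 0 or b <= 0:
        raise ValueError("this sketch assumes a > 0 and b > 0")
    if target_error <= c:
        # The curve only approaches c, so such a target is never reached.
        raise ValueError("target error is at or below the asymptote c")
    # a * n**(-b) + c = target  =>  n = (a / (target - c)) ** (1 / b)
    return math.ceil((a / (target_error - c)) ** (1.0 / b))
```

The estimate is only as trustworthy as the fitted parameters, which is why the choice of fitting algorithm matters.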

Specifically, the paper works with the Learning Curve Database (LCDB) and investigates alternatives to the Levenberg-Marquardt (LM) fitting algorithm currently employed. The alternatives considered are Gradient Descent and BFGS, and the paper aims to determine whether either is more suitable than LM for fitting learning curves.

Both algorithms were implemented in default and optimised configurations, and their results were compared against LM. The comparison covered mean squared error (MSE), L1 loss, the performance of individual parametric models, and computation time.
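A comparison of this kind can be sketched with SciPy. This is an illustration under stated assumptions, not the thesis implementation: the power-law model `pow3`, the starting point `x0`, the helper `fit_and_score`, and the reading of L1 loss as mean absolute error are all hypothetical choices made for the example.

```python
import numpy as np
from scipy.optimize import least_squares, minimize


def pow3(params, n):
    # Assumed parametric model: err(n) = a * n**(-b) + c
    a, b, c = params
    return a * n ** (-b) + c


def residuals(params, n, y):
    # Per-anchor residuals, the form Levenberg-Marquardt works on
    return pow3(params, n) - y


def mse_objective(params, n, y):
    # Scalar objective for BFGS
    return np.mean(residuals(params, n, y) ** 2)


def fit_and_score(n_train, y_train, n_test, y_test, x0=(1.0, 0.5, 0.0)):
    """Fit on the small training-set sizes, score extrapolation on larger ones."""
    lm = least_squares(residuals, x0, args=(n_train, y_train), method="lm")
    bfgs = minimize(mse_objective, x0, args=(n_train, y_train), method="BFGS")
    scores = {}
    for name, params in (("LM", lm.x), ("BFGS", bfgs.x)):
        err = pow3(params, n_test) - y_test
        scores[name] = {
            "mse": float(np.mean(err ** 2)),
            "l1": float(np.mean(np.abs(err))),  # L1 taken as mean absolute error
        }
    return scores


# Synthetic, noise-free curve used purely for illustration
n = np.array([16, 32, 64, 128, 256, 512, 1024, 2048, 4096], dtype=float)
y = 2.0 * n ** (-0.5) + 0.1
scores = fit_and_score(n[:6], y[:6], n[6:], y[6:])
```

Computation time could be measured by wrapping each solver call with `time.perf_counter()`.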

The findings show that Gradient Descent is not a suitable alternative to LM. BFGS, however, proved competitive: its fitting performance is practically identical to that of LM, while it runs significantly faster. The results address the aim of the paper and raise new questions for future work.

Further exploration of the BFGS algorithm and its application to learning curve fitting is recommended. The MSE distributions of LM and BFGS could be compared in more depth, and the comparison could be extended to new parametric models, learners, and datasets.

Files

CSE3000_Lucian_Negru_Final_.pd... (pdf)

(pdf | 0.374 Mb)