What is the effect of Gaussian filtering applied before curve fitting?

Bachelor Thesis (2025)
Author(s)

I. Moanta (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Tom Viering – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

O. Taylan Turan – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

C. Yan – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

A. Van Deursen – Graduation committee member (TU Delft - Software Engineering)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
31-01-2025
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Learning curves are graphical representations of the relationship between dataset size and error rate in machine learning. Curve fitting is the process of estimating a learning curve using a mathematical formula. This paper analyzes two ways of performing curve fitting: interpolation and extrapolation. The accuracy of curve fitting can suffer when the curve has an irregular shape or contains noise. Our study investigates how applying a Gaussian filter before curve fitting affects the fit, and whether it can improve performance. We do this by evaluating multiple values of the Gaussian filter's standard deviation parameter (sigma) on a wide variety of learning curves, both smooth and noisy. The main finding is that the Gaussian filter can significantly improve extrapolation, especially when it is applied to noisy curves. For interpolation, on the other hand, its impact is smaller, and even negligible for smooth curves. An important takeaway is that selecting the pre-processing method best suited to the type of curve analyzed may yield valuable findings in the study of learning curves in machine learning.
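
The pipeline described in the abstract (smooth the curve with a Gaussian filter, then fit a formula and extrapolate) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the thesis: the power-law model `error = a * n**(-b)`, the synthetic noisy curve, the sigma values, and the helper names `gaussian_smooth`, `fit_power_law`, and `predict`. The filter is hand-written in pure Python and renormalizes the kernel at the edges, which may differ from the boundary handling used in the actual experiments.

```python
import math
import random


def gaussian_smooth(values, sigma):
    """Smooth a 1-D sequence with a Gaussian kernel (sigma in index units).

    The kernel is truncated at 3 * sigma and renormalized near the edges,
    so a constant sequence stays constant. sigma <= 0 means no filtering.
    """
    if sigma <= 0:
        return list(values)
    radius = max(1, math.ceil(3 * sigma))
    smoothed = []
    for i in range(len(values)):
        total, weight_sum = 0.0, 0.0
        for j in range(max(0, i - radius), min(len(values), i + radius + 1)):
            w = math.exp(-0.5 * ((j - i) / sigma) ** 2)
            total += w * values[j]
            weight_sum += w
        smoothed.append(total / weight_sum)
    return smoothed


def fit_power_law(sizes, errors):
    """Fit error ~ a * n**(-b) by ordinary least squares in log-log space.

    Assumed model for illustration only; requires strictly positive errors.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope


def predict(a, b, n):
    """Evaluate the fitted power law at dataset size n."""
    return a * n ** (-b)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the synthetic curve is reproducible
    sizes = [2 ** k for k in range(4, 16)]
    # Synthetic noisy learning curve (hypothetical data, multiplicative noise).
    noisy = [predict(0.9, 0.3, n) * math.exp(rng.gauss(0.0, 0.1)) for n in sizes]
    train = 9  # fit on the first 9 anchors, extrapolate to the largest size
    target = predict(0.9, 0.3, sizes[-1])
    for sigma in (0.0, 0.5, 1.0, 2.0):
        # Only the training anchors are filtered, so the target is never seen.
        smoothed = gaussian_smooth(noisy[:train], sigma)
        a, b = fit_power_law(sizes[:train], smoothed)
        gap = abs(predict(a, b, sizes[-1]) - target)
        print(f"sigma={sigma}: extrapolation error = {gap:.4f}")
```

Running the script prints one extrapolation error per sigma, which makes it easy to compare the unfiltered fit (`sigma=0.0`) against filtered ones. Note that a larger sigma removes more noise but also flattens the curve, so the filter itself can introduce bias; this trade-off is one reason to compare several sigma values rather than fix a single one.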
