Title Learning Curves: How do Data Imbalances affect the Learning Curves using Nearest Mean Model?
Author Feng, Kevin (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor Viering, T.J. (mentor); Turan, O.T. (mentor)
Degree granting institution Delft University of Technology
Programme Computer Science and Engineering
Project CSE3000 Research Project
Date 2024-02-01
Abstract This research investigates the impact of data imbalance on the learning curve of the nearest mean model. Learning curves show how the performance of a model changes as the training set size increases. Imbalanced datasets are common in real-life scenarios and pose challenges to standard classifiers, degrading their performance. The research question is therefore: "How do data imbalances affect the learning curves using the nearest mean model?" To answer it, an experiment is conducted in which data is sampled from a multivariate Gaussian distribution at varying levels of imbalance. The imbalance ratios explored are [0.1, 0.2, 0.3, 0.4, 0.5], each giving the fraction of the dataset that belongs to the minority class. The findings indicate that as the data becomes more imbalanced, the learning curves reach their accuracy plateau later, that is, at larger training set sizes. An analysis of the parameters of the curves, which follow a logistic function, suggests that imbalance affects the maximum achievable accuracy and shifts the curves rightward. However, the effect on the maximum achievable accuracy is not significant, and the shape of the curves remains similar. Additionally, false negatives have a significant impact on the learning curves.
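The experiment outlined in the abstract can be illustrated with a minimal sketch. It is not the author's code: the two-dimensional setting, identity covariances, class separation `sep`, test set size, and number of repeats are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def sample_imbalanced(n, minority_frac, rng, dim=2, sep=2.0):
    """Draw n points from two Gaussians; class 1 is the minority class.

    Assumed setup: identity covariance, class means `sep` apart.
    """
    n_min = max(1, int(round(n * minority_frac)))  # keep at least one minority point
    n_maj = n - n_min
    mean_maj = np.zeros(dim)
    mean_min = np.full(dim, sep / np.sqrt(dim))
    X = np.vstack([
        rng.multivariate_normal(mean_maj, np.eye(dim), size=n_maj),
        rng.multivariate_normal(mean_min, np.eye(dim), size=n_min),
    ])
    y = np.concatenate([np.zeros(n_maj, dtype=int), np.ones(n_min, dtype=int)])
    return X, y

def nearest_mean_fit(X, y):
    """Return the per-class mean vectors, stacked as rows (class 0, class 1)."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def nearest_mean_predict(means, X):
    """Assign each point to the class whose mean is closest in Euclidean distance."""
    dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
    return dists.argmin(axis=1)

def learning_curve(minority_frac, train_sizes, n_test=2000, repeats=20, seed=0):
    """Mean test accuracy for each training size, averaged over `repeats` draws."""
    rng = np.random.default_rng(seed)
    curve = []
    for n in train_sizes:
        accs = []
        for _ in range(repeats):
            X_tr, y_tr = sample_imbalanced(n, minority_frac, rng)
            X_te, y_te = sample_imbalanced(n_test, minority_frac, rng)
            means = nearest_mean_fit(X_tr, y_tr)
            accs.append(np.mean(nearest_mean_predict(means, X_te) == y_te))
        curve.append(float(np.mean(accs)))
    return curve

if __name__ == "__main__":
    sizes = [2, 5, 10, 20, 50, 100, 200]
    for frac in [0.1, 0.2, 0.3, 0.4, 0.5]:
        print(frac, [round(a, 3) for a in learning_curve(frac, sizes)])
```

Training sizes start at 2 so that each class contributes at least one point to its mean. Under these assumptions, comparing the printed rows across imbalance ratios gives a rough view of how quickly each curve approaches its plateau.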
Subject Learning Curve; Imbalance Data sets; Machine Learning
To reference this document use: http://resolver.tudelft.nl/uuid:09603d28-aa2b-48a5-9f61-095fa1084e57
Part of collection Student theses
Document type bachelor thesis
Rights © 2024 Kevin Feng
Files PDF 5293200_Research_Report.pdf 1.57 MB