Clustering Learning Curves in Machine Learning using K-Means Algorithm

Can patterns be identified amongst learning curves after the application of the K-Means algorithm using point and statistical vectors?

Abstract

A learning curve can serve as an indicator of the “performance of trained models versus the training set size” [1]. Recent research on learning curve analysis has been carried out using the Learning Curve Database (LCDB) [2]. This paper investigates whether there are similarities amongst these curves by clustering those provided by the LCDB with the K-Means algorithm. The experiment employs two distinct input representations: point vectors and statistical vectors. Through individual learner analysis, individual dataset analysis, principal component analysis, and other experiments, patterns are isolated for both input sets. After further exploration of shapes and distributions, the conclusion is that point vector clustering produced one concrete key pattern amongst certain learning techniques. In contrast, the statistical vector findings are inconclusive and do not exhibit a clear distinction that could be linked to any dominant pattern.
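
To make the two input representations concrete, the following is a minimal sketch of K-Means clustering over learning curves. It rests on several assumptions that are not taken from the paper. A point vector is assumed to be the curve's raw scores at a fixed set of training set sizes. A statistical vector is assumed to be a handful of summary statistics; the particular choice here (mean, standard deviation, minimum, maximum, overall gain) is hypothetical. The curves are synthetic toy data rather than LCDB curves, and the deterministic farthest-first initialisation is chosen only to keep the example reproducible.

```python
import statistics

def point_vector(curve):
    # Assumed representation: the raw scores at each training set size.
    return list(curve)

def statistical_vector(curve):
    # Hypothetical choice of summary statistics; the paper may use others.
    return [
        statistics.mean(curve),
        statistics.pstdev(curve),
        min(curve),
        max(curve),
        curve[-1] - curve[0],  # overall gain from first to last anchor
    ]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, max_iterations=100):
    # Farthest-first initialisation: deterministic, assumed for this sketch.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(squared_distance(v, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Synthetic toy curves: three that rise with more data, three that stay flat.
rising = [[0.50 + d, 0.62 + d, 0.72 + d, 0.80 + d, 0.85 + d, 0.88 + d]
          for d in (0.00, 0.01, 0.02)]
flat = [[0.60 + d, 0.61 + d, 0.60 + d, 0.61 + d, 0.60 + d, 0.61 + d]
        for d in (0.00, 0.01, 0.02)]
curves = rising + flat

point_labels = kmeans([point_vector(c) for c in curves], k=2)
stat_labels = kmeans([statistical_vector(c) for c in curves], k=2)
print("point vector clusters:      ", point_labels)
print("statistical vector clusters:", stat_labels)
```

The toy curves are deliberately well separated, so both representations recover the two families here; this says nothing about how the representations behave on the LCDB curves, which is the question the paper addresses.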