Title: Filtering Knowledge: A Comparative Analysis of Information-Theoretical-Based Feature Selection Methods
Author: Vasilev, Kiril (TU Delft Electrical Engineering, Mathematics and Computer Science; TU Delft Web Information Systems)
Contributor: Katsifodimos, A. (mentor); Ionescu, A. (mentor); Isufi, E. (graduation committee)
Degree granting institution: Delft University of Technology
Corporate name: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2023-06-28

Abstract: The data used by machine learning algorithms strongly influences what those algorithms can achieve. Feature selection techniques choose a set of columns that meets a given learning goal. There is a wide variety of feature selection methods; the ones we cover in this comparative analysis belong to the information-theoretical-based family. We evaluate MIFS, MRMR, CIFE, and JMI using the machine learning algorithms Logistic Regression, XGBoost, and Support Vector Machines. Multiple datasets with a variety of feature types are used during evaluation. We find that MIFS and MRMR are 2-4 times faster than CIFE and JMI. MRMR and JMI choose columns that lead to significantly higher accuracy and lower root mean squared error earlier in the selection process. The results we present here can help data scientists pick the right feature selection method depending on the datasets used.

Subject: Data Augmentation; Feature Selection; Information Theory; Comparative analysis
To reference this document use: http://resolver.tudelft.nl/uuid:fbcf96d8-3685-4838-85e6-ee6887c25e15
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2023 Kiril Vasilev
Files: PDF kiril_vasilev_filtering_k ... wledge.pdf (1.02 MB)
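
The four methods named in the abstract are often presented in the feature selection literature as variants of one greedy criterion, J(f) = I(f;y) - beta * sum_s I(f;s) + gamma * sum_s I(f;s|y), where s ranges over the features already selected. The sketch below illustrates that common formulation for discrete columns only. It is not taken from the thesis and may differ from its implementation: the function names, the `mifs_beta=0.5` default, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def mutual_info(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def cond_mutual_info(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def weights(method, n_selected, mifs_beta=0.5):
    """Return (beta, gamma) for the given method.

    mifs_beta is a hypothetical default; MIFS leaves beta as a user parameter.
    """
    inv = 1.0 / n_selected if n_selected else 0.0
    return {
        "MIFS": (mifs_beta, 0.0),  # fixed redundancy penalty
        "MRMR": (inv, 0.0),        # redundancy averaged over selected set
        "CIFE": (1.0, 1.0),        # unscaled redundancy and conditional terms
        "JMI": (inv, inv),         # both terms averaged over selected set
    }[method]


def select_features(features, target, k, method="MRMR"):
    """Greedily pick k names from a dict of name -> discrete column."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        beta, gamma = weights(method, len(selected))

        def score(name):
            f = features[name]
            redundancy = sum(mutual_info(f, features[s]) for s in selected)
            conditional = sum(
                cond_mutual_info(f, features[s], target) for s in selected
            )
            return mutual_info(f, target) - beta * redundancy + gamma * conditional

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): the target is determined by a and b together,
# a_dup duplicates a, and c is unrelated to the target.
toy_features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "a_dup": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [0, 0, 1, 1, 0, 0, 1, 1],
    "c": [0, 1, 0, 1, 0, 1, 0, 1],
}
toy_target = [0, 0, 1, 1, 2, 2, 3, 3]

for name in ("MIFS", "MRMR", "CIFE", "JMI"):
    print(name, select_features(toy_features, toy_target, 2, name))
```

On this toy data every criterion picks `a` and then `b`, because the redundancy term penalizes the duplicate column `a_dup`. The sketch also suggests one plausible reason for the runtime gap reported in the abstract: CIFE and JMI evaluate an extra conditional mutual information term for every candidate and selected feature pair, while MIFS and MRMR do not.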