Filtering Knowledge: A Comparative Analysis of Information-Theoretical-Based Feature Selection Methods
K.V. Vasilev (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Asterios Katsifodimos – Mentor (TU Delft - Web Information Systems)
A. Ionescu – Mentor (TU Delft - Web Information Systems)
Elvin Isufi – Graduation committee member (TU Delft - Multimedia Computing)
Abstract
The data fed to a machine learning algorithm strongly influences its performance. Feature selection techniques choose a subset of columns suited to a given learning goal. Although there is a wide variety of feature selection methods, this comparative analysis covers those from the information-theoretical-based family. We evaluate MIFS, MRMR, CIFE, and JMI using three machine learning algorithms: Logistic Regression, XGBoost, and Support Vector Machines.
Multiple datasets with a variety of feature types are used during evaluation. We find that MIFS and MRMR are 2-4 times faster than CIFE and JMI, while MRMR and JMI choose columns that reach significantly higher accuracy and lower root mean squared error earlier in the selection process. The results presented here can help data scientists pick the right feature selection method for their datasets.
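To illustrate the kind of criterion these methods share, below is a minimal sketch of greedy MRMR selection: each step picks the feature with the highest relevance (mutual information with the target) minus its mean redundancy (mutual information with already selected features). This is an illustrative sketch using scikit-learn's mutual information estimators and a synthetic dataset, not the implementation evaluated in the thesis.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr(X, y, k, random_state=0):
    """Greedy MRMR sketch: score(j) = relevance(j) - mean redundancy(j)."""
    n_features = X.shape[1]
    # Relevance: MI between each feature and the (discrete) target.
    relevance = mutual_info_classif(X, y, random_state=random_state)
    selected, remaining = [], list(range(n_features))
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in remaining:
            if selected:
                # Redundancy: mean MI between candidate j and each
                # already selected (continuous) feature.
                redundancy = np.mean(
                    [mutual_info_regression(X[:, [s]], X[:, j],
                                            random_state=random_state)[0]
                     for s in selected])
            else:
                redundancy = 0.0
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
        remaining.remove(best)
    return selected

X, y = make_classification(n_samples=200, n_features=8,
                           n_informative=3, random_state=0)
print(mrmr(X, y, k=3))
```

CIFE and JMI follow the same greedy template but replace the redundancy term with conditional or joint mutual information terms, which is why they are more expensive to compute.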