Efficient and effective feature discovery for CART decision tree model

Bachelor Thesis (2022)
Author(s)

A.B.C. Bien (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

R. Hai – Mentor (TU Delft - Web Information Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Benedict Bien
Publication Year
2022
Language
English
Graduation Date
22-06-2022
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract


A common challenge in feature discovery and feature selection is the trade-off between effectiveness and efficiency. This paper proposes an approach to ranking features for feature discovery that is both efficient and effective.
The approach estimates the overall utility of each feature from its rankings with respect to characteristics such as the correlation coefficient, Gini impurity, and information gain. For each characteristic, it calculates the likelihood that a feature is selected by a wrapper feature selection technique, given the feature's rank for that characteristic. These likelihoods are recorded and combined into an estimate of the feature's overall utility, which is then used to rank all features.
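The ranking idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the likelihoods are estimated or combined, so the sketch makes two assumptions: each likelihood is the observed fraction of features at a given rank that a wrapper method selected on earlier datasets, and the per-characteristic likelihoods are combined by taking their mean. All function names, characteristic names, and data values are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def rank_positions(scores):
    """Map each feature name to its rank position (0 = highest score)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: pos for pos, name in enumerate(ordered)}


def estimate_likelihoods(history):
    """Estimate P(selected by wrapper | characteristic, rank position).

    history: list of (scores_by_characteristic, selected_features) pairs,
    one pair per previously analysed dataset (assumed input format).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for scores_by_char, selected in history:
        for char, scores in scores_by_char.items():
            for name, pos in rank_positions(scores).items():
                totals[(char, pos)] += 1
                hits[(char, pos)] += name in selected
    return {key: hits[key] / totals[key] for key in totals}


def rank_by_utility(scores_by_char, likelihoods, default=0.0):
    """Rank features by the mean likelihood over all characteristics.

    Averaging is an assumption; the abstract only says the likelihoods
    are "combined". Unseen (characteristic, rank) pairs get `default`.
    """
    ranks = {char: rank_positions(s) for char, s in scores_by_char.items()}
    features = list(next(iter(scores_by_char.values())))
    utility = {
        name: sum(
            likelihoods.get((char, ranks[char][name]), default)
            for char in ranks
        ) / len(ranks)
        for name in features
    }
    return sorted(utility, key=utility.get, reverse=True), utility


# Hypothetical history: characteristic scores and wrapper selections
# for two earlier datasets.
history = [
    ({"correlation": {"a": 0.9, "b": 0.2, "c": 0.5},
      "information_gain": {"a": 0.4, "b": 0.1, "c": 0.3}}, {"a"}),
    ({"correlation": {"x": 0.1, "y": 0.8, "z": 0.3},
      "information_gain": {"x": 0.2, "y": 0.6, "z": 0.5}}, {"y", "z"}),
]
likelihoods = estimate_likelihoods(history)

# Hypothetical new dataset whose features should be ranked by utility.
new_scores = {
    "correlation": {"p": 0.3, "q": 0.7, "r": 0.1},
    "information_gain": {"p": 0.5, "q": 0.2, "r": 0.4},
}
order, utility = rank_by_utility(new_scores, likelihoods)
print(order, utility)
```

In this sketch the expensive wrapper selection is only needed to build the history. Ranking the features of a new dataset then requires only the cheap per-feature characteristics, which is one way to read the efficiency claim in the abstract.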

Files

Mandatory_cover_Page.pdf
(pdf | 0.593 Mb)
License info not available