Title: Variable importance measures for random forests
Author: Boon, Cindy (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Parolya, N. (mentor); Ferreira, José (graduation committee); Kurowicka, D. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Applied Mathematics
Date: 2021-07-09

Abstract

Measuring variable importance is often difficult: among other things, models can be complex, and covariates can interact with one another and be correlated. This study focuses on two questions. First, what should the theoretical measure of variable importance be under a given data-generating model? Second, what are the best estimates of these theoretical measures? Two theoretical measures and several corresponding estimates are presented, one of which is the well-known random forests variable importance measure (Breiman, 2001). A simulation study is carried out for both linear and nonlinear models to determine which estimates of the variable importance measures are best for given data-generating models. Most measures struggle when covariates are correlated, but their performance improves when the number of split variables is tuned.

Subject: Random Forests; Variable importance measures; Correlation
To reference this document use: http://resolver.tudelft.nl/uuid:bc1b0369-efea-47e6-95e1-ece779ce736a
Part of collection: Student theses
Document type: master thesis
Rights: © 2021 Cindy Boon
Files: Master_thesis_report_CJM_Boon.pdf (PDF, 1.69 MB)