Cost-sensitive optimal decision trees dealing with delays

Bachelor Thesis (2023)
Author(s)

S.C. Butzelaar (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Koos van der Linden – Mentor (TU Delft - Algorithmics)

Emir Demirović – Mentor (TU Delft - Algorithmics)

Frans A. Oliehoek – Graduation committee member (TU Delft - Interactive Intelligence)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2023 Sven Butzelaar
More Info
expand_more
Publication Year
2023
Language
English
Copyright
© 2023 Sven Butzelaar
Graduation Date
03-02-2023
Awarding Institution
Delft University of Technology
Project
['CSE3000 Research Project']
Programme
['Computer Science and Engineering']
Faculty
Electrical Engineering, Mathematics and Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Machine learning can be used to classify patients in a hospital. Here, the classifier has to minimize the cost of misclassifying the patient and minimize the costs of the tests. Unfortunately, obtaining features may be costly, e.g., taking blood tests or doing an x-ray scan. Furthermore, it is possible that acquiring those test results may take a few days. To train such a classifier, several machine learning algorithms exist. Decision trees appear as favourites since, unlike other classifiers, decision trees do not need all feature values to classify an instance. Current approaches, however, only use heuristics to find a local optimum. Although heuristics are relatively fast, they are not optimal and therefore may not capture well the underlying characteristics of the given dataset. We propose an optimal approach to train cost-sensitive decision trees while also considering these delayed tests. Here we show, smaller trees with higher accuracy and a lower cost can be constructed, compared to a heuristic approach. We use dynamic programming to allow us to skip many calculations, which speeds up the programming time to seconds.

Files

Research_paper_final.pdf
(pdf | 0.657 Mb)
License info not available