Exploiting multi-core parallelism on optimal decision trees

Bachelor Thesis (2021)
Author(s)

Ayush Patandin (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Emir Demirovic – Mentor (TU Delft - Algorithmics)

Johan Pouwelse – Graduation committee member (TU Delft - Data-Intensive Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2021 Ayush Patandin
More Info
expand_more
Publication Year
2021
Language
English
Copyright
© 2021 Ayush Patandin
Graduation Date
02-07-2021
Awarding Institution
Delft University of Technology
Project
['CSE3000 Research Project']
Programme
['Computer Science and Engineering']
Faculty
Electrical Engineering, Mathematics and Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This paper presents a study that discusses how multi-threading can be used to improve the runtime performance of constructing optimal classification trees. Decision trees are popular for solving classification or regression problems in machine learning. Heuristic methods are used to build decision tree algorithms that produce models of high accuracy within a short amount of time. An important limitation is that these heuristics locally optimize the decisions of the tree model. Consequently, in recent years, optimal classification tree algorithms have been introduced to strive for global optimality when learning decision trees. Unfortunately, the runtimes for constructing optimal decision trees are quite larger in comparison with the runtimes obtained from heuristic solutions. The study provides a mitigation for this by parallelizing the work of a recently invented optimal decision tree algorithm on multiple cores. There exist different parallel techniques to divide and schedule the work among processors. Our strategy follows the parallel approach that computes optimal decision trees using threads as processing elements in a shared memory space. In the end, we provide the experimental study to show that impressive runtime results of the optimal decision tree algorithm are successfully obtained with the help of the parallelization strategy.

Files

License info not available