Optimal Regression Trees via Dynamic Programming
Optimization techniques for learning Regression Trees
M. van den Bos (TU Delft - Electrical Engineering, Mathematics and Computer Science)
E. Demirović – Mentor (TU Delft - Algorithmics)
J.G.M. van der Linden – Mentor (TU Delft - Algorithmics)
Burcu Kulahcioglu Ozkan – Graduation committee member (TU Delft - Software Engineering)
More Info
expand_more
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Decision trees make decisions in a way interpretable to humans, this is important when machines are increasingly used to aid in making high-stakes and socially sensitive decisions. While heuristics have been used for a long time to find decision trees with reasonable accuracy, recent approaches find fully optimal trees. Due to the computational hardness of finding fully optimal decision trees, it is only practically possible to find shallow trees for a limited dataset size. However, continuous algorithmic improvements keep pushing the scale of feasible solutions. Dynamic Programming approaches promise to find scalable optimal decision trees but need to be adapted for different objectives, such as regression. We combine and adapt the algorithmic techniques of two Dynamic Programming methods, creating a new method that improves the scalability of optimal regression trees. This new method often achieves an order of magnitude speed improvement over a previous state-of-the-art method.