Search Strategies for Optimal Decision Trees

Classification and Regression with Continuous Features

Master Thesis (2025)
Author(s)

M. van den Bos (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

E. Demirović – Mentor (TU Delft - Algorithmics)

J.G.M. van der Linden – Mentor (TU Delft - Algorithmics)

Jérémie Decouchant – Graduation committee member (TU Delft - Data-Intensive Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
24-09-2025
Awarding Institution
Delft University of Technology
Programme
Electrical Engineering | Embedded Systems
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Interpretable machine learning models, such as decision trees, are needed when decisions require trust. Optimal decision trees have been shown to generalise better to unseen data than greedily constructed trees, but because the problem is NP-hard, they are difficult to apply to large datasets. Previous methods either do not take continuous features into account, are designed for a fixed maximum tree depth of two or three, or do not consider objectives other than classification.

We introduce CODTree, the first specialised branch-and-bound algorithm that finds optimal classification and regression trees with continuous features for an arbitrary maximum depth. Our algorithm can run different search strategies and includes a specialised solver for shallow trees. Our experiments show that global best-first search with a heuristic that prioritises smaller nodes with better lower bounds achieves the best time-to-optimality and anytime performance, and that the specialised solver for shallow trees provides a geometric mean speedup of 77.4x.
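To illustrate the idea behind global best-first search with such a heuristic, the sketch below shows a generic best-first branch-and-bound loop in Python. It is not CODTree itself: the node representation, the exact priority heuristic, and the toy instance (minimising x over an integer interval by halving) are all assumptions made for illustration. Nodes are popped in order of (lower bound, subproblem size), a simple stand-in for "smaller nodes with better lower bounds".

```python
import heapq
from itertools import count

def best_first_bnb(root, lower_bound, size, is_leaf, cost, expand):
    """Generic best-first branch-and-bound (illustrative sketch).

    Nodes are popped in order of (lower bound, size): the best bound
    first, with ties broken in favour of the smaller subproblem.
    """
    best = float("inf")
    tie = count()  # unique tiebreaker so heapq never compares node objects
    heap = [(lower_bound(root), size(root), next(tie), root)]
    while heap:
        lb, _, _, node = heapq.heappop(heap)
        if lb >= best:
            break  # every remaining node is bounded away from improving
        if is_leaf(node):
            best = min(best, cost(node))
            continue
        for child in expand(node):
            if lower_bound(child) < best:  # prune dominated children early
                heapq.heappush(
                    heap, (lower_bound(child), size(child), next(tie), child)
                )
    return best

# Toy instance: minimise x over the half-open interval [3, 19) by halving.
root = (3, 19)
result = best_first_bnb(
    root,
    lower_bound=lambda n: n[0],
    size=lambda n: n[1] - n[0],
    is_leaf=lambda n: n[1] - n[0] == 1,
    cost=lambda n: n[0],
    expand=lambda n: [(n[0], (n[0] + n[1]) // 2), ((n[0] + n[1]) // 2, n[1])],
)
print(result)  # 3
```

Because the frontier is global, the search always works on the most promising open subproblem anywhere in the tree, rather than finishing one subtree before moving to the next; the pruning test `lb >= best` is what makes the loop a branch-and-bound rather than plain best-first search.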

Compared to the state of the art for classification, our algorithm has comparable runtime, but our search strategy performs four orders of magnitude fewer operations on some datasets, although the returns diminish at greater depths. For regression, our algorithm is significantly faster than the state of the art and, to the best of our knowledge, the first to find optimal regression trees at depth four.

Files

Thesis.pdf
(pdf | 4.56 MB)
License info not available