Interpretable models are essential in many machine learning applications, particularly in domains where transparency and trust are critical. Decision trees are a popular interpretable model, but their structure often leads to repeated identical subtrees and data fragmentation, which can result in large, overfit models with poor generalization. Decision diagrams could offer a more compact and less fragmented alternative by allowing parts of the diagram to be shared, but their construction is computationally challenging and they have seen limited practical adoption. This thesis introduces a novel local search-based algorithm for learning binary decision diagrams for classification. Our approach occupies a middle ground between early greedy methods and exact optimization techniques, enabling scalable construction of compact, accurate diagrams. We define a local search approach with several move operators and explore multiple metaheuristics, identifying hill climbing with information gain-based initialization as the most effective strategy. We refer to this method as Decision Diagram Local Search (DDLS). We evaluate DDLS on 57 real-world datasets from the UCI repository and 480 synthetic datasets generated from known diagrams. On the real-world datasets, our method achieves accuracy competitive with or superior to state-of-the-art decision tree and diagram methods while producing small models with low fragmentation. Though challenges remain, especially on complex synthetic datasets, our results suggest that DDLS, and decision diagrams more broadly, hold significant untapped potential as interpretable classifiers.
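To make the hill-climbing strategy concrete, the sketch below shows a generic best-improvement loop of the kind DDLS builds on. It is an illustrative sketch only, not the thesis implementation: the names `neighbors` and `score` are hypothetical stand-ins for the diagram-specific move operators and the training-accuracy objective, which are abstracted away here.

```python
def hill_climb(initial, neighbors, score, max_iters=1000):
    """Generic best-improvement hill climbing.

    Repeatedly evaluate all candidate moves (`neighbors`) of the
    current solution and take the best-scoring one; stop when no
    move improves the objective (a local optimum) or the iteration
    budget is exhausted.
    """
    current = initial
    current_score = score(current)
    for _ in range(max_iters):
        best_move, best_score = None, current_score
        for candidate in neighbors(current):
            s = score(candidate)
            if s > best_score:
                best_move, best_score = candidate, s
        if best_move is None:  # no improving move: local optimum
            return current
        current, current_score = best_move, best_score
    return current

# Toy usage: maximize -(x - 3)^2 over the integers with +/-1 moves.
result = hill_climb(
    initial=0,
    neighbors=lambda x: [x - 1, x + 1],
    score=lambda x: -(x - 3) ** 2,
)
print(result)  # → 3
```

In DDLS, `initial` would correspond to the information gain-based starting diagram, and each neighbor to the result of applying one move operator; the loop structure itself is unchanged.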