Deterministic Finite Automata (DFAs) are interpretable classification models, typically learned through merging states of a large tree-like automaton, an Augmented Prefix Tree Acceptor (APTA), according to heuristic suitability metrics. This paper introduces an ensembling approac
...
Deterministic Finite Automata (DFAs) are interpretable classification models, typically learned through merging states of a large tree-like automaton, an Augmented Prefix Tree Acceptor (APTA), according to heuristic suitability metrics. This paper introduces an ensembling approach for DFAs that does not depend on such heuristics. Starting from the APTA, we construct diverse automata by applying randomized sequences of state merges, while avoiding repetition of merges whenever possible. We also propose a novel graph-connectivity-based metric for inter-model variety. Experimental results on the STAMINA competition datasets yield improved predictive performance compared to models learned using state-of-the-art heuristics on sparse datasets, as well as a tight connection between inter-model variety and performance.