Ensemble techniques for (P)DFA learning
Effect of changing the sequence orders on DFA ensembles learned via EDSM
W.M. Cupiał (TU Delft - Electrical Engineering, Mathematics and Computer Science)
S.E. Verwer – Mentor (TU Delft - Algorithmics)
Simon Dieck – Mentor (TU Delft - Algorithmics)
N.M. Gürel – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)
More Info
expand_more
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Learning a Deterministic Finite Automaton (DFA) from a language sample is an essential problem in grammatical inference, with applications in various fields, such as modeling and analyzing software systems. In this work, we propose approaches to create an ensemble of DFAs learned with the Evidence Driven State Merging algorithm. To produce varying models from the given data, we introduce two algorithms for manipulating the sequence orders. Additionally, we propose a similarity metric that allows for reducing the ensemble size by discarding similar models. The proposed approaches were analyzed and empirically evaluated using the dataset used during the StaMinA competition. Experimental results demonstrate that the methods for obtaining ensembles of DFAs presented in this work provide a number of advantages over the single DFA learned from the classical prefix tree acceptor using EDSM.