HR
H. Radu
info
Please Note
<p>This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged to a profile.</p>
1 records found
1
A theoretical analysis of optimal and heuristic methods for DFA learning
Bachelor’s Degree Thesis
Deterministic finite automata (DFA) are interpretable models used for classification and prediction tasks based on sequence data. They often act as surrogate models for software systems. Plenty of methods exist for the purpose of DFA learning. Examples include optimal algorithms such as SAT-based encoding and various heuristic methods as the likes of the BlueFringe framework of the EDSM algorithm. By definition, optimal algorithms can guarantee a minimal DFA consistent with the training data, but this does not exclude the possibility of heuristics also finding the optimal solution. However, it is generally believed that optimal methods could always require strictly less data to learn such a minimal model than their counterparts. In our research, we provide mathematical proofs and counter-examples that show the above statement to be false. We further demonstrate that, unless formally defined, there exist numerous languages and settings where heuristics outperform optimal methods on data efficiency benchmarks. Finally, we prove that optimal methods are equal to the BlueFringe framework in terms of optimistic learning efficiency.
...
Deterministic finite automata (DFA) are interpretable models used for classification and prediction tasks based on sequence data. They often act as surrogate models for software systems. Plenty of methods exist for the purpose of DFA learning. Examples include optimal algorithms such as SAT-based encoding and various heuristic methods as the likes of the BlueFringe framework of the EDSM algorithm. By definition, optimal algorithms can guarantee a minimal DFA consistent with the training data, but this does not exclude the possibility of heuristics also finding the optimal solution. However, it is generally believed that optimal methods could always require strictly less data to learn such a minimal model than their counterparts. In our research, we provide mathematical proofs and counter-examples that show the above statement to be false. We further demonstrate that, unless formally defined, there exist numerous languages and settings where heuristics outperform optimal methods on data efficiency benchmarks. Finally, we prove that optimal methods are equal to the BlueFringe framework in terms of optimistic learning efficiency.