HR

H. Radu

info

Please Note

1 records found

Bachelor thesis (2025) - H. Radu, S.E. Verwer, S. Dieck, S.S. Chakraborty
Deterministic finite automata (DFA) are interpretable models used for classification and prediction tasks based on sequence data. They often act as surrogate models for software systems. Plenty of methods exist for the purpose of DFA learning. Examples include optimal algorithms such as SAT-based encoding and various heuristic methods as the likes of the BlueFringe framework of the EDSM algorithm. By definition, optimal algorithms can guarantee a minimal DFA consistent with the training data, but this does not exclude the possibility of heuristics also finding the optimal solution. However, it is generally believed that optimal methods could always require strictly less data to learn such a minimal model than their counterparts. In our research, we provide mathematical proofs and counter-examples that show the above statement to be false. We further demonstrate that, unless formally defined, there exist numerous languages and settings where heuristics outperform optimal methods on data efficiency benchmarks. Finally, we prove that optimal methods are equal to the BlueFringe framework in terms of optimistic learning efficiency. ...