DFA Learning: Minimal Models from Subsamples vs. Heuristic Models from Full Data
M. Hristodorescu (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Sicco Verwer – Mentor (TU Delft - Algorithmics)
Simon Dieck – Mentor (TU Delft - Algorithmics)
Soham Chakraborty – Graduation committee member (TU Delft - Programming Languages)
More Info
expand_more
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Learning deterministic finite automata (DFAs) from labeled traces is a key problem with applications in software analysis and system modeling. SAT-based methods are effective but can be slow when dealing with large datasets. To address this, we propose a sampling method that selects a smaller, but still representative set of traces. Our approach groups traces with similar suffixes and uses edit distance to choose diverse examples. The proposed sampling performs better than random uniform sampling and significantly better than heuristic algorithms.