Title: A Systematic Comparison of Search Algorithms for Topic Modelling—A Study on Duplicate Bug Report Identification
Author: Panichella, A. (TU Delft Software Engineering)
Contributor: Nejati, Shiva (editor); Gay, Gregory (editor)
Date: 2019-01-01
Abstract: Latent Dirichlet Allocation (LDA) has been used to support many software engineering tasks. Previous studies showed that default settings lead to sub-optimal topic modeling with a dramatic impact on the performance of such approaches in terms of precision and recall. For this reason, researchers used search algorithms (e.g., genetic algorithms) to automatically configure topic models in an unsupervised fashion. While previous work showed the ability of individual search algorithms in finding near-optimal configurations, it is not clear to what extent the choice of the meta-heuristic matters for SE tasks. In this paper, we present a systematic comparison of five different meta-heuristics to configure LDA in the context of duplicate bug reports identification. The results show that (1) no master algorithm outperforms the others for all software projects, (2) random search and PSO are the least effective meta-heuristics. Finally, the running time strongly depends on the computational complexity of LDA while the internal complexity of the search algorithms plays a negligible role.
Subject: Duplicate Bug Report; Evolutionary Algorithms; Latent Dirichlet Allocation; Search-based Software Engineering; Topic modeling
To reference this document use: http://resolver.tudelft.nl/uuid:a3603e0f-8d87-4735-9876-50cc7aebf1f8
DOI: https://doi.org/10.1007/978-3-030-27455-9_2
Publisher: Springer
ISBN: 9783030274542
Source: Search-Based Software Engineering - 11th International Symposium, SSBSE 2019, Proceedings
Event: 11th International Symposium on Search-Based Software Engineering, SSBSE 2019, 2019-08-31 → 2019-09-01, Tallinn, Estonia
Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 0302-9743, 11664 LNCS
Part of collection: Institutional Repository
Document type: conference paper
Rights: © 2019 A. Panichella
Files: main.pdf (PDF, 472.17 KB)