Parameterizing and Assembling IR-Based Solutions for SE Tasks Using Genetic Algorithms

Conference Paper (2016)
Author(s)

A. Panichella (TU Delft - Software Engineering)

Bogdan Dit

Rocco Oliveto (University of Molise)

Massimiliano Di Penta (University of Sannio)

Denys Poshyvanyk (College of William and Mary)

Andrea De Lucia (University of Salerno)

Research Group
Software Engineering
DOI related publication
https://doi.org/10.1109/SANER.2016.97
Publication Year
2016
Language
English
Pages (from-to)
314-325
ISBN (electronic)
978-1-5090-1855-0

Abstract

Information Retrieval (IR) approaches are now used to support various software engineering tasks, such as feature location, traceability link recovery, clone detection, and refactoring. However, previous studies showed that an inadequate instantiation of an IR technique and its underlying process can significantly affect the performance of such approaches in terms of precision and recall. This paper proposes the use of Genetic Algorithms (GAs) to automatically configure and assemble an IR process for software engineering tasks. The approach (named GA-IR) determines the (near) optimal solution to be used for each stage of the IR process, i.e., term extraction, stop word removal, stemming, indexing, and the calibration of an IR algebraic method. We applied GA-IR to two different software engineering tasks, namely traceability link recovery and identification of duplicate bug reports. The results of the study indicate that GA-IR outperforms approaches previously published in the literature, and that it does not significantly differ from an ideal upper bound that could be achieved by a supervised and combinatorial approach.
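
As an illustration only, the idea of encoding one choice per IR stage and searching the resulting space with a GA might be sketched as follows. This is not the GA-IR implementation: the option lists in STAGES, the function names, the GA settings, and especially toy_fitness are hypothetical placeholders, since the abstract does not state which options or fitness function the approach uses.

```python
import random

# Hypothetical options for each stage named in the abstract (assumed values).
STAGES = {
    "term_extraction": ["keep_identifiers", "split_identifiers"],
    "stop_word_removal": ["none", "english_list"],
    "stemming": ["none", "porter", "snowball"],
    "indexing": ["boolean", "tf", "tf_idf"],
    "algebraic_method_k": [10, 50, 100, 200],
}
STAGE_NAMES = list(STAGES)


def random_config(rng):
    """Draw one option per stage to form a candidate IR process."""
    return {name: rng.choice(STAGES[name]) for name in STAGE_NAMES}


def crossover(a, b, rng):
    """Uniform crossover: each stage is inherited from one of the parents."""
    return {name: (a if rng.random() < 0.5 else b)[name] for name in STAGE_NAMES}


def mutate(config, rng, rate=0.2):
    """Re-draw each stage's option with probability `rate`."""
    return {
        name: rng.choice(STAGES[name]) if rng.random() < rate else value
        for name, value in config.items()
    }


def evolve(fitness, generations=30, pop_size=20, seed=0):
    """Simple generational GA with truncation selection and two elites."""
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - 2)
        ]
        population = ranked[:2] + children  # keep the two best unchanged
    return max(population, key=fitness)


def toy_fitness(config):
    """Placeholder score: counts matches with an arbitrary target configuration.

    A real fitness would run the configured IR process and measure its quality.
    """
    target = {
        "term_extraction": "split_identifiers",
        "stop_word_removal": "english_list",
        "stemming": "porter",
        "indexing": "tf_idf",
        "algebraic_method_k": 100,
    }
    return sum(config[name] == target[name] for name in STAGE_NAMES)


best = evolve(toy_fitness)
```

Here best is a dictionary holding one selected option per stage. Replacing toy_fitness with a function that builds and evaluates the corresponding IR process would turn the sketch into an actual configuration search.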
