Automatic Identification of Key Classes in a Software System Using Webmining Techniques

More Info
expand_more

Abstract

Preprint of article published in: Journal of Software Maintenance and Evolution: Research and Practice (Wiley), 20 (6), 2008; doi:10.1002/smr.370 Software engineers new to a project are often stuck sorting through hundreds of classes in order to find those few classes that offer a significant insight into the inner workings of the software project. To help stimulate this process, we propose a technique that can identify the most important classes in a system or the key classes of that system. Software engineers can use these classes to focus their understanding efforts when starting to work on a new software project. Those key classes are typically characterized with having a lot of ‘control’ within the application. In order to find these controlling classes, we present a detection approach that is based on dynamic coupling and webmining. We demonstrate the potential of our technique using two open-source software systems that have a rich documentation set. During the case studies we use dynamically gathered coupling information that vary between a number of coupling metrics. The case studies show that we are able to retrieve 90% of the classes deemed important by the original maintainers of the systems, while maintaining a level of precision of around 50%.