Metric-based Evaluation of Implemented Software Architectures
Abstract
Software systems make up an important part of our daily lives. Like all man-made objects, the possibilities of a software system are constrained by the choices made during its creation. The complete set of these choices can be referred to as the software architecture of the system. Since the software architecture has a large influence on what can, and cannot, be done with the system, it is important to evaluate this architecture regularly. The purpose of such an evaluation is to create an overview of the strengths and weaknesses of the software architecture, which can then be used to decide whether each weakness is accepted or should be addressed.

There is a wide range of software architecture evaluation methods available that can be used to investigate one or more quality aspects of a software architecture. Most of these methods focus on the initial design of the software architecture; only a few specifically target the implemented architecture. Looking at the design alone is not problematic if design and implementation are in sync, but unfortunately there are many occasions in which these two architectures deviate. Moreover, some systems are built without an initial design at all. In addition, earlier research shows that software architectures are not regularly evaluated in practice, despite the availability of these methods. The reason for this is that project teams consider the initial effort needed to start performing software architecture evaluations to be too high.

The goal of this thesis is to lower this initial investment by providing an overview of concrete evaluation attributes, as well as definitions of software metrics that can be used to evaluate these attributes. Our overall research approach is that of "industry-as-a-lab": during our research we have closely collaborated with the Software Improvement Group to design solutions and to test these solutions on real-world cases. To be able to provide concrete advice we focus on a single quality attribute; in this thesis that attribute is the maintainability of a software system.

Following from this goal, the research in this thesis consists of two parts. The first part focuses on the identification of architectural attributes that can serve as indicators for the maintainability of a software system. The result of mining over 40 evaluation reports, interviewing experts, and validating the outcome with various experts is a list of 15 architectural attributes that experts consider to be indicators of the maintainability of a software system. To augment the opinion of the experts, we used theories from the field of cognitive psychology to extend an existing model of architectural complexity. This extended model makes it possible to explain why each of the identified attributes influences the maintainability of a software system. Based on the attributes and the model we developed a lightweight sanity check for implemented architectures. This check consists of 28 questions and 28 actions divided over five categories. A person familiar with a system can use this check to get an initial overview of the status of the system within a day, and needs less effort to repeat the evaluation later on.

In the second part of our research we focus on the design and validation of metrics related to two quality attributes: balance and independence.
These two attributes relate to two of the major building blocks of an implemented architecture: the definition of the components of the system and the relationships these components have with each other. In the ideal case a system is decomposed into a limited set of components at the same level of abstraction, while the dependencies between these components are kept to a minimum.

To quantify the number of components and their level of abstraction we define the Component Balance metric. This metric takes the number of components of a system and the distribution of the system's volume over these components, and outputs a score between zero and one. Interviews with experts and a case study show that the scores of this metric correlate with scores given by experts.

The dependencies between the components are quantified by a Dependency Profile. In such a profile, all code within a component is placed into one of four categories, depending on whether a piece of code is depended upon by, or depends on, code in other components. A large-scale experiment shows that the percentage of code in these categories correlates with the ratio of local change within a system.

Both metrics have been implemented in practice to evaluate their usefulness within the context of the evaluation of implemented architectures. A structured observation of experts using the metrics over a period of six months, together with interviews with 11 experts, shows that there is room for improvement, but that the two metrics are considered useful within this context.

The combination of different research methods such as interviews, case studies, empirical experiments and grounded theory, augmented by experiences taken from practice, has led to research results that are both valid and useful. In this thesis we lower the initial effort needed to start performing architectural evaluations by showing which concrete attributes should be taken into account, and how these attributes can be evaluated in a continuous manner. Additionally, we define and validate metrics for two of these attributes, and show that experts find these metrics useful in the evaluation of implemented architectures in practice.
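To make the Component Balance description above more concrete, the following Python sketch shows one way such a score could be computed. It is an illustration under assumed choices, not the formula defined in the thesis: it multiplies a penalty for deviating from an assumed ideal number of components (here seven, with a tolerance of five) by a uniformity factor taken as one minus the Gini coefficient of the component volumes.

from typing import Sequence

def size_uniformity(volumes: Sequence[float]) -> float:
    """Return how evenly the volume is spread over components (1 = perfectly even).

    Illustrative choice: one minus the Gini coefficient of the component volumes.
    """
    sizes = sorted(volumes)
    n, total = len(sizes), sum(sizes)
    if n == 0 or total == 0:
        return 0.0
    cumulative = sum((i + 1) * size for i, size in enumerate(sizes))
    gini = (2 * cumulative) / (n * total) - (n + 1) / n
    return 1.0 - gini

def count_score(n_components: int, ideal: int = 7, tolerance: int = 5) -> float:
    """Penalise having far fewer or far more components than an assumed ideal."""
    return max(0.0, 1.0 - abs(n_components - ideal) / tolerance)

def component_balance(volumes: Sequence[float]) -> float:
    """Combine both factors into a single score between zero and one."""
    return count_score(len(volumes)) * size_uniformity(volumes)

# Six components of roughly equal volume score high; one dominant
# component surrounded by tiny ones scores low.
print(round(component_balance([120, 100, 90, 110, 95, 85]), 2))  # ~0.75
print(round(component_balance([900, 10, 10, 5]), 2))             # ~0.11

Any concrete choice for the "ideal" number of components and for the uniformity measure is an assumption of this sketch; the point is only that both factors, component count and volume distribution, feed into a single score between zero and one.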
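The Dependency Profile can be illustrated in a similar way. The category names used below (internal, inbound, outbound, transit) and the use of module counts as a stand-in for code volume are assumptions made for this example; the abstract speaks of percentages of code, but counting modules keeps the sketch short.

from collections import Counter
from typing import Dict, Iterable, Tuple

def dependency_profile(
    module_component: Dict[str, str],
    dependencies: Iterable[Tuple[str, str]],
) -> Dict[str, float]:
    """Split modules into four categories based on cross-component dependencies
    and return the percentage of modules in each category.
    """
    incoming, outgoing = set(), set()
    for source, target in dependencies:
        if module_component[source] != module_component[target]:
            outgoing.add(source)   # source depends on code in another component
            incoming.add(target)   # target is depended upon from another component

    counts = Counter()
    for module in module_component:
        has_in, has_out = module in incoming, module in outgoing
        if has_in and has_out:
            counts["transit"] += 1
        elif has_in:
            counts["inbound"] += 1
        elif has_out:
            counts["outbound"] += 1
        else:
            counts["internal"] += 1

    total = len(module_component)
    return {category: 100.0 * counts[category] / total
            for category in ("internal", "inbound", "outbound", "transit")}

# Two components (A and B), four modules, one cross-component dependency.
components = {"a.Api": "A", "a.Core": "A", "b.Client": "B", "b.Util": "B"}
deps = [("b.Client", "a.Api"), ("a.Core", "a.Api")]
print(dependency_profile(components, deps))
# {'internal': 50.0, 'inbound': 25.0, 'outbound': 25.0, 'transit': 0.0}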