Software evolution: the lifetime of fine-grained elements

Journal Article (2021)
Author(s)

Diomidis Spinellis (Athens University of Economics and Business, TU Delft - Software Engineering)

Panos Louridas (Athens University of Economics and Business)

Maria Kechagia (University College London)

Research Group
Software Engineering
Copyright
© 2021 D. Spinellis, Panos Louridas, M. Kechagia
DOI related publication
https://doi.org/10.7717/PEERJ-CS.372
Publication Year
2021
Language
English
Volume number
7
Pages (from-to)
1-33
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

A model of the lifetime of individual source code lines or tokens can estimate maintenance effort, guide preventive maintenance, and, more broadly, identify factors that can improve the efficiency of software development. We present methods and tools that allow tracking of each line’s or token’s birth and death. Through them, we analyze 3.3 billion source code element lifetime events in 89 revision control repositories. Statistical analysis shows that code lines are durable, with a median lifespan of about 2.4 years, and that young lines are more likely to be modified or deleted, following a Weibull distribution with the associated hazard rate decreasing over time. This behavior appears to be independent of specific characteristics of lines or tokens, as we could not determine factors that significantly influence their longevity across projects. The programming language and developer tenure and experience were not found to be significantly correlated with line or token longevity, while project size and project age showed only a slight correlation.
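The Weibull behavior described in the abstract can be illustrated with a minimal sketch. It assumes the standard two-parameter Weibull form, with survival function S(t) = exp(-(t/scale)^shape) and hazard h(t) = (shape/scale)(t/scale)^(shape-1). The median of 2.4 years comes from the abstract. The shape value of 0.5 is a hypothetical choice made only to obtain a decreasing hazard (any shape below 1 does so) and is not a parameter reported here. The function names are likewise illustrative and do not refer to the authors' tools.

```python
import math

MEDIAN_YEARS = 2.4  # median line lifespan stated in the abstract
SHAPE = 0.5         # hypothetical shape (< 1 gives a decreasing hazard); not from the paper


def scale_from_median(median, shape):
    """Solve S(median) = 0.5 for the Weibull scale parameter."""
    return median / math.log(2) ** (1.0 / shape)


def survival(t, shape, scale):
    """Probability that a line is still unchanged after t years."""
    return math.exp(-((t / scale) ** shape))


def hazard(t, shape, scale):
    """Instantaneous rate of modification or deletion at age t (t > 0)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


SCALE = scale_from_median(MEDIAN_YEARS, SHAPE)

if __name__ == "__main__":
    for years in (0.5, 1.0, 2.4, 5.0, 10.0):
        s = survival(years, SHAPE, SCALE)
        h = hazard(years, SHAPE, SCALE)
        print(f"age {years:>4} y: survival {s:.3f}, hazard {h:.3f} per year")
```

Under these assumptions, half of the lines survive to the 2.4-year median, and the hazard at one year exceeds the hazard at five years, which matches the qualitative claim that young lines are more likely to be modified or deleted.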