M. Kechagia
Please Note
14 records found
1
Software evolution
The lifetime of fine-grained elements
A model regarding the lifetime of individual source code lines or tokens can estimate maintenance effort, guide preventive maintenance, and, more broadly, identify factors that can improve the efficiency of software development. We present methods and tools that allow tracking of each line’s or token’s birth and death. Through them, we analyze 3.3 billion source code element lifetime events in 89 revision control repositories. Statistical analysis shows that code lines are durable, with a median lifespan of about 2.4 years, and that young lines are more likely to be modified or deleted, following a Weibull distribution with the associated hazard rate decreasing over time. This behavior appears to be independent from specific characteristics of lines or tokens, as we could not determine factors that influence significantly their longevity across projects. The programing language, and developer tenure and experience were not found to be significantly correlated with line or token longevity, while project size and project age showed only a slight correlation.
The Exception Handling Riddle
An Empirical Study on the Android API
Interfaces (APIs), enhance modularity and reduce development time. Nevertheless, their use reinforces system dependency on third-party software. When libraries become obsolete or their APIs change, performing the necessary modifications to dependent systems, can be time-consuming, labour intensive and error-prone. In this paper, we propose a methodology that reduces the effort developers must spend to mitigate library obsolescence. We describe
the steps comprising the methodology, i.e., source code analysis, visualisation
of hot areas, code-based transformation, and verification of the modified system. Also, we present some preliminary results and describe our plan for developing a fully automated software modernisation approach. ...
Interfaces (APIs), enhance modularity and reduce development time. Nevertheless, their use reinforces system dependency on third-party software. When libraries become obsolete or their APIs change, performing the necessary modifications to dependent systems, can be time-consuming, labour intensive and error-prone. In this paper, we propose a methodology that reduces the effort developers must spend to mitigate library obsolescence. We describe
the steps comprising the methodology, i.e., source code analysis, visualisation
of hot areas, code-based transformation, and verification of the modified system. Also, we present some preliminary results and describe our plan for developing a fully automated software modernisation approach.
Goal: Measure the energy consumption and run-time performance of commonly used programming tasks implemented in different programming languages and executed on a variety of platforms to help developers to choose appropriate implementation platforms.
Method: Obtain measurements to calculate the Energy Delay Product, a weighted function that takes into account a task’s energy consumption and run-time performance. We perform our tests by calculating the Energy Delay Product of 25 programming tasks, found in the Rosetta Code Repository, which are implemented in 14 programming languages and run on three different computer platforms, a server, a laptop, and an embedded system.
Results: Compiled programming languages are outperforming the interpreted ones for most, but not for all tasks. C, C#, and JavaScript are on average the best performing compiled, semi-compiled, and interpreted programming languages for the Energy Delay Product, and Rust appears to be well-placed for i/o-intensive operations, such as file handling. We also find that a good behaviour, energywise, can be the result of clever optimizations and design choices in seemingly unexpected programming languages. ...
Goal: Measure the energy consumption and run-time performance of commonly used programming tasks implemented in different programming languages and executed on a variety of platforms to help developers to choose appropriate implementation platforms.
Method: Obtain measurements to calculate the Energy Delay Product, a weighted function that takes into account a task’s energy consumption and run-time performance. We perform our tests by calculating the Energy Delay Product of 25 programming tasks, found in the Rosetta Code Repository, which are implemented in 14 programming languages and run on three different computer platforms, a server, a laptop, and an embedded system.
Results: Compiled programming languages are outperforming the interpreted ones for most, but not for all tasks. C, C#, and JavaScript are on average the best performing compiled, semi-compiled, and interpreted programming languages for the Energy Delay Product, and Rust appears to be well-placed for i/o-intensive operations, such as file handling. We also find that a good behaviour, energywise, can be the result of clever optimizations and design choices in seemingly unexpected programming languages.
CodeFeedr's vision entails: (1) The ability to unify archival and current software analytics data under a single query language, and (2) The feasibility to apply new techniques and methods for high-level aggregation and summarization of near real-time information on software development. In this paper, we outline three use cases where our platform is expected to have a significant impact on the quality and speed of decision making; dependency management, productivity analytics, and run-time error feedback. ...
CodeFeedr's vision entails: (1) The ability to unify archival and current software analytics data under a single query language, and (2) The feasibility to apply new techniques and methods for high-level aggregation and summarization of near real-time information on software development. In this paper, we outline three use cases where our platform is expected to have a significant impact on the quality and speed of decision making; dependency management, productivity analytics, and run-time error feedback.
The role of exceptions is crucial for the robustness of modern applications and critical systems. Despite this, there is a long debate among researchers, programming language designers, and practitioners regarding the usefulness and appropriateness of the available exception types and their classification. In this paper, we examine Java exceptions and propose a new class hierarchy and compile-time mechanisms that take into account the context in which exceptions can arise. We believe that the increased specificity of exception handling based on our proposal can boost its effectiveness and lead to fewer application failures.
In this paper, we propose to configure at compiletime the checking associated with Application Programming Interfaces' methods that can receive possibly malformed values (e.g. erroneous user inputs and problematic retrieved recordsfrom databases) and thus cause application execution failures. To achieve this, we design a type system for implementing apluggable checker on the Java's compiler and find at compile timeinsufficient checking bugs that can lead to application crashesdue to malformed inputs. Our goal is to wrap methods whenthey receive external inputs so that the former generate checkedinstead of unchecked exceptions. We believe that our approachcan improve Java developers' productivity, by using exceptionhandling only when it is required, and ensure client applications'stability. We want to evaluate our checker by using it to verifythe source code of Java projects from the Apache ecosystem. Also, we want to analyze stack traces to validate the identifiedfailures by our checker.
Analyzing programming languages' energy consumption
An empirical study
The evolution of c programming practices
A study of the unix operating system 1973-2015
Tracking long-term progress in engineering and applied science allows us to take stock of things we have achieved, appreciate the factors that led to them, and set realistic goals for where we want to go. We formulate seven hypotheses associated with the long term evolution of C programming in the Unix operating system, and examine them by extracting, aggregating, and synthesising metrics from 66 snapshots obtained from a synthetic software configuration management repository covering a period of four decades. We found that over the years developers of the Unix operating system appear to have evolved their coding style in tandem with advancements in hardware technology, promoted modularity to tame rising complexity, adopted valuable new language features, allowed compilers to allocate registers on their behalf, and reached broad agreement regarding code formatting. The progress we have observed appears to be slowing or even reversing prompting the need for new sources of innovation to be discovered and followed.
Programs draw significant parts of their functionality through the use of Application Programming Interfaces (APIs). Apart from the way developers incorporate APIs in their software, the stability of these programs depends on the design and implementation of the APIs. In this work, we report how we used software telemetry data to analyze the causes of API failures in Android applications. Specifically, we got 4.9 gb worth of crash data that thousands of applications sent to a centralized crash report management service. We processed that data to extract approximately a million stack traces, stitching together parts of chained exceptions, and established heuristic rules to draw the border between applications and the API calls. We examined a set of more than a half million stack traces associated with risky API calls to map the space of the most common application failure reasons. Our findings show that the top ones can be attributed to memory exhaustion, race conditions or deadlocks, and missing or corrupt resources. Given the classes of the crash causes we identified, we recommend API design and implementation choices, such as specific exceptions, default resources, and non-blocking algorithms, that can eliminate common failures. In addition, we argue that development tools like memory analyzers, thread debuggers, and static analyzers can prevent crashes through early code testing and analysis. Finally, some execution platform and framework designs for process and memory management can also eliminate some application crashes.
Context: Numerous factors drive long term progress in programming practices. Goal: We study the evolution of C programming in the Unix operating system. Method: We extract, aggregate, and synthesize metrics from 66 snapshots obtained from an artificial software configuration management repository tracking the evolution of the Unix operating system over four decades. Results: C language programming practices appear to evolve over long term periods; our study identified some continuous trends with highly significant coefficients of determination. Many trends point toward increasing code quality through adherence to numerous programming guidelines, while some others indicate adoption that has reached maturity. In the area of commenting progress appears to have stalled. Conclusions: Studying the long term evolution of programming practices identifies areas where progress has been achieved along an expected path, as well as cases where there is room for improvement.
Undocumented and unchecked
Exceptions that spell trouble
Modern programs depend on apis to implement a significant part of their functionality. Apart from the way developers use apis to build their software, the stability of these programs relies on the apis design and implementation. In this work, we evaluate the reliability of apis, by examining software telemetry data, in the form of stack traces, coming from Android application crashes. We got 4.9 GB worth of crash data that thousands of applications send to a centralized crash report management service. We processed that data to extract approximately a million stack traces, stitching together parts of chained exceptions, and established heuristic rules to draw the border between applications and api calls. We examined 80% of the stack traces to map the space of the most common application failure reasons. Our findings show that the top ones can be attributed to memory exhaustion, race conditions or deadlocks, and missing or corrupt resources. At the same time, a significant number of our stack traces (over 10%) remains unclassified due to generic unchecked exceptions, which do not highlight the problems that lead to crashes. Finally, given the classes of crash causes we found, we argue that api design and implementation improvements, such as specific exceptions, non-blocking algorithms, and default resources, can eliminate common failures.