D. Spinellis

149 records found

Science typically advances in small incremental steps, but in some rare instances it leaps forward. One discovery or invention can change how we see the world around us. Would it not be neat to be able to accurately pinpoint those moments of time in an objective way and thereby i ...
In contrast to physical objects and living things, software doesn’t deteriorate with the passage of time. While we age and our shoes fall apart, digital storage ensures that the software’s bits stay immutable. And yet, software needs substantial maintenance over time, owing to ch ...
The C preprocessor, a key element of the language, has become a liability due to its lack of integration with modern language semantics. This column describes the analysis of the C preprocessor usage in the Linux kernel, comprising 20 million lines of code, using the CScout refac ...
Generative AI based on large-language models is significantly impacting software development through IDE assistants, cloud-based APIs, and interactive chatbots for coding assistance. It excels in generating and translating code and data, navigating APIs, and creating boilerplate ...
Effective data processing workflows are crucial in data science, business analytics, and machine learning. Domain-specific tools can be invaluable, but often custom workflows are needed. Key to their success is splitting data and tasks into manageable chunks to enhance reliabilit ...
Effective change management is crucial for businesses heavily reliant on software and services to minimise incidents induced by changes. Unfortunately, in practice it is often difficult to effectively use artificial intelligence for IT Operations (AIOps) to enhance service manage ...
RDBUnit is a unit testing framework designed to test relational database queries, created out of a need for unit testing them while working on software analytics tasks. It is available as a Python package on PyPI and open-source software on GitHub. RDBUnit tests consist of three ...
Code refactoring is an essential part of software development, because it reduces technical debt, enhances long-term code sustainability, and enables the implementation of functionality that might have been incompatible with an original design. IDEs automate many refactoring task ...
Machine learning (ML) techniques increase the effectiveness of software engineering (SE) lifecycle activities. We systematically collected, quality-assessed, summarized, and categorized 83 reviews in ML for SE published between 2009 and 2022, covering 6,117 primary studies. The S ...
Considerable scientific work involves locating, analyzing, systematizing, and synthesizing other publications, often with the help of online scientific publication databases and search engines. However, use of online sources suffers from a lack of repeatability and transparency, ...
Developers and data scientists often struggle to write command-line inputs, even though graphical interfaces or tools like ChatGPT can assist. The solution? "ai-cli," an open-source system inspired by GitHub Copilot that converts natural language prompts into executable commands f ...
Existing work on the practical impact of software engineering (SE) research examines industrial relevance rather than adoption of study results, hence the question of how results have been practically applied remains open. To answer this and investigate the outcomes of impactful ...
We propose a testing framework for validating static typing procedures in compilers. Our core component is a program generator suitably crafted for producing programs that are likely to trigger typing compiler bugs. One of our main contributions is that our program generator give ...
Context: Software development projects increasingly adopt unit testing as a way to identify and correct program faults early in the construction process. Code that is unit tested should therefore have fewer failures associated with it. Objective: Compare the number of field failu ...
Context: An excessive number of code smells make a software system hard to evolve and maintain. Machine learning methods, in addition to metric-based and heuristic-based methods, have been recently applied to detect code smells; however, current methods are considered far from ma ...

A Replication Package for PyCG

Practical Call Graph Generation in Python

The ICSE 2021 paper titled 'PyCG: Practical Call Graph Generation in Python' comes with a replication package with the purpose of providing open access to (1) our prototype call graph generator, namely PyCG, and (2) the data and scripts that replicate the results of the paper. Th ...
We introduce what is, to the best of our knowledge, the first approach for systematically testing Object-Relational Mapping (ORM) systems. Our approach leverages differential testing to establish a test oracle for ORM-specific bugs. Specifically, we first generate random relation ...

Software evolution

The lifetime of fine-grained elements

A model regarding the lifetime of individual source code lines or tokens can estimate maintenance effort, guide preventive maintenance, and, more broadly, identify factors that can improve the efficiency of software development. We present methods and tools that allow tracking of ...

Acquiring developer-prized practical skills, knowledge, and experiences.