Zhang, Xunhui (author), Yu, Yue (author), Gousios, G. (author), Rastogi, A. (author) Context: The pull-based development model is widely used in open source projects, leading to the emergence of trends in distributed software development. One aspect that has garnered significant attention concerning pull request decisions is the identification of explanatory factors. Objective: This study builds on a decade of research on... journal article 2023
Al-Kaswan, A. (author), Ahmed, Toufique (author), Izadi, M. (author), Sawant, Anand Ashok (author), Devanbu, Premkumar (author), van Deursen, A. (author) Binary reverse engineering is used to understand and analyse programs for which the source code is unavailable. Decompilers can help, transforming opaque binaries into a more readable source code-like representation. Still, reverse engineering is difficult and costly, involving considerable effort in labelling code with helpful summaries.... conference paper 2023
van Dam, Tim (author), Izadi, M. (author), van Deursen, A. (author) Transformer-based pre-trained models have recently achieved great results in solving many software engineering tasks including automatic code completion which is a staple in a developer’s toolkit. While many have striven to improve the code-understanding abilities of such models, the opposite – making the code easier to understand – has not been... conference paper 2023
van Deursen, A. (author) Eelco Visser (1966–2022) was a leading member of the department of Software Technology (ST) of the faculty of Electrical Engineering, Mathematics, and Computer Science (EEMCS) of Delft University of Technology. He had a profound influence on the educational programs in computer science at TU Delft, built a highly successful Programming... conference paper 2023
Al-Kaswan, A. (author), Izadi, M. (author), van Deursen, A. (author) Previous work has shown that Large Language Models are susceptible to so-called data extraction attacks. This allows an attacker to extract a sample that was contained in the training data, which has massive privacy implications. The construction of data extraction attacks is challenging, current attacks are quite inefficient, and there exists a... other 2023
Altmeyer, P. (author), Giovan, Angela (author), Buszydlik, Aleksander (author), Dobiczek, Karol (author), van Deursen, A. (author), Liem, C.C.S. (author) Existing work on Counterfactual Explanations (CE) and Algorithmic Recourse (AR) has largely focused on single individuals in a static environment: given some estimated model, the goal is to find valid counterfactuals for an individual instance that fulfill various desiderata. The ability of such counterfactuals to handle dynamics like data and... conference paper 2023
Altmeyer, P. (author), Liem, C.C.S. (author), van Deursen, A. (author) We present CounterfactualExplanations.jl: a package for generating Counterfactual Explanations (CE) and Algorithmic Recourse (AR) for black-box models in Julia. CE explain how inputs into a model need to change to yield specific model predictions. Explanations that involve realistic and actionable changes can be used to provide AR: a set of... conference paper 2023
Derakhshanfar, P. (author), Devroey, Xavier (author), Panichella, A. (author), Zaidman, A.E. (author), van Deursen, A. (author) Search-based approaches have been used in the literature to automate the process of creating unit test cases. However, related work has shown that generated tests with high code coverage could be ineffective, i.e., they may not detect all faults or kill all injected mutants. In this paper, we propose Cling, an integration-level test case... journal article 2023
Yarally, Tim (author), Cruz, Luis (author), Feitosa, Daniel (author), Sallou, J. (author), van Deursen, A. (author) Modern AI practices all strive towards the same goal: better results. In the context of deep learning, the term "results" often refers to the achieved accuracy on a competitive problem set. In this paper, we adopt an idea from the emerging field of Green AI to consider energy consumption as a metric of equal importance to accuracy and to... conference paper 2023
Poenaru-Olaru, L. (author), Cruz, Luis (author), Rellermeyer, Jan S. (author), van Deursen, A. (author) AIOps solutions enable faster discovery of failures in operational large-scale systems through machine learning models trained on operation data. These models become outdated during the occurrence of concept drift, a term used to describe shifts in data distributions. In operation data, concept drift is inevitable and it impacts the... conference paper 2023
Shome, A. (author), Cruz, Luis (author), van Deursen, A. (author) Visualisations drive all aspects of the Machine Learning (ML) Development Cycle but remain a vastly untapped resource by the research community. ML testing is a highly interactive and cognitive process which demands a human-in-the-loop approach. Besides writing tests for the code base, the bulk of the evaluation requires application of domain... conference paper 2023
Siachamis, G. (author), Kanis, Job (author), Koper, Wybe (author), Psarakis, K. (author), Fragkoulis, M. (author), van Deursen, A. (author), Katsifodimos, A. (author) In this work, we evaluate autoscaling solutions for stream processing engines. Although autoscaling has become a mainstream subject of research in the last decade, the database research community has yet to evaluate different autoscaling techniques under a proper benchmarking setting and evaluation framework. As a result, every newly proposed... conference paper 2023
Poenaru-Olaru, L. (author), Sallou, J. (author), Cruz, Luis (author), Rellermeyer, Jan S. (author), van Deursen, A. (author) Deployed machine learning systems often suffer from accuracy degradation over time generated by constant data shifts, also known as concept drift. Therefore, these systems require regular maintenance, in which the machine learning model needs to be adapted to concept drift. The literature presents plenty of model adaptation techniques. The... conference paper 2023
Siachamis, G. (author), Psarakis, K. (author), Fragkoulis, M. (author), Papapetrou, Odysseas (author), van Deursen, A. (author), Katsifodimos, A. (author) How can we perform similarity joins of multi-dimensional streams in a distributed fashion, achieving low latency? Can we adaptively repartition those streams in order to retain high performance under concept drifts? Current approaches to similarity joins are either restricted to single-node deployments or focus on set-similarity joins, failing... conference paper 2023
Kula, E. (author), Greuter, Eric (author), van Deursen, A. (author), Gousios, G. (author) Late delivery of software projects and cost overruns have been common problems in the software industry for decades. Both problems are manifestations of deficiencies in effort estimation during project planning. With software projects being complex socio-technical systems, a large pool of factors can affect effort estimation and on-time... journal article 2022
Gissurarson, Matthías Páll (author), Applis, L.H. (author), Panichella, A. (author), van Deursen, A. (author), Sands, David (author) Automatic program repair (APR) regularly faces the challenge of overfitting patches: patches that pass the test suite, but do not actually address the problems when evaluated manually. Currently, overfit detection requires manual inspection or an oracle, making quality control of APR an expensive task. With this work, we want to introduce... conference paper 2022
Maddila, C.S. (author), Nagappan, Nachiappan (author), Bird, Christian (author), Gousios, G. (author), van Deursen, A. (author) Modern, complex software systems are being continuously extended and adjusted. The developers responsible for this may come from different teams or organizations, and may be distributed over the world. This may make it difficult to keep track of what other developers are doing, which may result in multiple developers concurrently editing the... journal article 2022
Hejderup, J.I. (author), Beller, M.M. (author), Triantafyllou, K. (author), Gousios, G. (author) Modern programming languages such as Java, JavaScript, and Rust encourage software reuse by hosting diverse and fast-growing repositories of highly interdependent packages (i.e., reusable libraries) for their users. The standard way to study the interdependence between software packages is to infer a package dependency network by parsing... journal article 2022
Anderson, K.S. (author), Visser, Denise (author), Mannen, Jan-Willem (author), Jiang, Yuxiang (author), van Deursen, A. (author) Background: Applying Continuous Experimentation on a large scale is not easily achieved. Although the evolution within large tech organisations is well understood, we still lack a good understanding of how to transition a company towards applying more experiments. Objective: This study investigates how practitioners define, value and apply... conference paper 2022
Mir, S.A.M. (author), Latoskinas, Evaldas (author), Proksch, S. (author), Gousios, G. (author) Dynamic languages, such as Python and JavaScript, trade static typing for developer flexibility and productivity. Lack of static typing can cause run-time exceptions and is a major factor for weak IDE support. To alleviate these issues, PEP 484 introduced optional type annotations for Python. As retrofitting types to existing code-bases is... conference paper 2022