Person | TU Delft Repository

TestSpark: IntelliJ IDEA's Ultimate Test Generation Companion

Conference paper (2024) - Arkadii Sapozhnikov, Mitchell Olsthoorn, A. Panichella, V.V. Kovalenko, P. Derakhshanfar

Writing software tests is laborious and time-consuming. To address this, prior studies introduced various automated test-generation techniques. A well-explored research direction in this field is unit test generation, wherein artificial intelligence (AI) techniques create tests f ...

Data-Driven Software Engineering

Doctoral thesis (2021) - V.V. Kovalenko

Specialized tools, such as IDEs, issue trackers, and code review tools, are an indispensable part of the modern software engineering process. These tools are constantly evolving. Besides enabling tools to support a wider range of technologies and frameworks, we are learning to provide additional features in completely new ways. One prominent stream of innovation in software engineering tools is dedicated to utilizing historical data to enable data-driven features, such as defect prediction engines and recommender systems, which leverage records of prior activity to assist with decision making. Many data-driven features in software engineering tools originate outside the context of real-world tools, as techniques devised and evaluated by researchers in synthetic settings. While convenient, synthetic evaluation of approaches that are ultimately aimed at improving real-world problems involves a number of simplifications and assumptions. In this dissertation, we highlight several aspects that, while vital for bringing innovative methods to software engineering tools, are often discarded in existing research. We closely explore several topics specific to artificial evaluation environments, such as simplifications in mining file modification histories, the use of synthetic datasets for source code authorship attribution, and the gap between the accuracy of reviewer recommendation models and their perception by users. Moreover, we make a case for sharing technical artifacts by converting data mining pipelines into reusable tools, and propose a novel approach to modeling expertise transfer from code modification by capturing the individual contribution style of developers.
Key contributions of this dissertation include a high-level model of the lifecycle of a data-driven software engineering technique, a discussion of dangerous assumptions and simplifications that are made at every step in this lifecycle, a demonstration of the importance of a careful approach to mining software repositories, and a demonstration of serious misalignment between artificial evaluation and realistic environments for the problems of code reviewer recommendation and code authorship attribution. We conclude the dissertation by discussing the underlying reasons for the misalignment between research environments and real-world tools, and propose potential steps to narrow it and ultimately accelerate innovation in software engineering tooling.

Pandemic programming: How COVID-19 affects software developers and how their organizations can help

Context: As a novel coronavirus swept the world in early 2020, thousands of software developers began working from home. Many did so on short notice, under difficult and stressful conditions.

Objective: This study investigates the effects of the pandemic on developers’ w ...

PathMiner

A library for mining of path-based representations of code

Conference paper (2019) - V.V. Kovalenko, Egor Bogomolov, Timofey Bryksin, A. Bacchelli

One recent, significant advance in modeling source code for machine learning algorithms has been the introduction of path-based representation, an approach that represents a snippet of code as a collection of paths from its syntax tree. Such representation efficien ...

Does reviewer recommendation help developers?

Journal article (2019) - V.V. Kovalenko, N. Tintarev, Evgeny Pasynkov, Christian Bird, A. Bacchelli

Selecting reviewers for code changes is a critical step for an efficient code review process. Recent studies propose automated reviewer recommendation algorithms to support developers in this task. However, the evaluation of recommendation algorithms, when done apart from thei ...

Mining File Histories

Should we consider branches?

Conference paper (2018) - V.V. Kovalenko, F. Palomba, A. Bacchelli

Modern distributed version control systems, such as Git, offer support for branching, the possibility to develop parts of software outside the master trunk. Consideration of the repository structure in Mining Software Repositories (MSR) studies requires a thorough approach to mini ...

Code review for newcomers

Is it different?

Conference paper (2018) - V.V. Kovalenko, A. Bacchelli

Onboarding is a critical stage in the tenure of software developers with a project, because meaningful contribution requires familiarity with the codebase. Some software teams employ practices, such as mentoring, to help new developers get accustomed faster. Code review, i.e., th ...

Vladimir Kovalenko
