D. Spinellis | TU Delft Repository

Enhancing Issue Tracking Efficiency with AI-Driven Natural Language Processing: Improving Classification, Association and Resolution

Master thesis (2025) - V.A. Pocheva (author) , N. Yorke-Smith (mentor) , Maliheh Izadi (mentor) , René van den Berg (mentor) , Andreea Costea (graduation committee member) , Diomidis Spinellis (mentor)

In large-scale engineering environments, efficient issue tracking is essential for timely problem resolution and knowledge reuse. However, manual classification and association of issue reports present scalability challenges, further complicated by inconsistent annotations and th ...

What Secondary Issues Contribute to Operational Problems?

An Investigation Based on Public Postmortems

Bachelor thesis (2025) - A. Muresan (author) , Eileen Kapel (mentor) , Diomidis Spinellis (mentor) , Benedikt Ahrens (graduation committee member)

Operational incidents in software-defined systems can lead to significant disruptions, and while primary faults such as bugs or misconfigurations are well studied, secondary issues that exacerbate these failures remain underexplored. This research investigates what secondary issu ...

Understanding IT System Failures: Primary Fault Types, Severity Patterns, and Evolution in Modern Operations

An Analysis of Public Incident Reports Using Large Language Models

Bachelor thesis (2025) - J.A. Rutkowski (author) , Eileen Kapel (mentor) , Diomidis Spinellis (mentor) , Benedikt Ahrens (graduation committee member)

Modern businesses increasingly rely on software-driven operations, making system reliability a critical concern. Despite advances in automated operations, gaps remain in understanding how the primary causes of system failures manifest, impact operational severity, and evolve in c ...

Linking Software Changes to Incident Reports

Investigating Correlations Between Root Causes and the Mean Time To Repair of Incidents

Bachelor thesis (2025) - D.M. Bunschoten (author) , Diomidis Spinellis (mentor) , Eileen Kapel (mentor) , Benedikt Ahrens (graduation committee member)

The availability and reliability of online systems form the cornerstone of modern civilization. Companies actively try to minimize downtime during incidents, and publishing incident reports afterwards is a standard practice. However, what is missing is an overview of the distribu ...

Anatomy of a Fix: Analyzing Solution Patterns in Public IT Incident Reports

Insights from Postmortems on Mitigations and Fixes in Production Systems

Bachelor thesis (2025) - M. Georgiev (author) , Diomidis Spinellis (mentor) , Eileen Kapel (mentor) , Benedikt Ahrens (graduation committee member)

This study examined common remediation strategies by analysing publicly available IT incident reports. A six-category taxonomy (“Software Fix”, “Rollback”, “Traffic Switch”, “Hardware/Infrastructure Repair or Operation”, “Self-Resolved”, and “Undisclosed/Not Specified”) was devel ...

Understanding Software Failures Through Incident Report Analysis

An Empirical Study of 348 Incident Reports from the VOID

Bachelor thesis (2025) - I.M. Aldea (author) , Diomidis Spinellis (mentor) , Eileen Kapel (mentor) , Benedikt Ahrens (graduation committee member)

Software changes are a leading cause of operational failures in complex production systems. Despite the increasing use of Artificial Intelligence for Development Operations and the availability of postmortem data, research on software incidents remains fragmented and narrowly sco ...

Topic Classification of Publications

Identifying publication topics based on existing journals

Bachelor thesis (2024) - D. Lim (author) , Diomidis Spinellis (mentor) , G. Gousios (mentor) , Koen Langendoen (graduation committee member)

Accurate topic classification is crucial in the scientific community when it comes to finding relevant journals. However, the efficiency and accuracy of topic classification of publications do not seem to be at their best, especially with the fast-paced rise in the quan ...

Towards More Effective Querying of Medical Literature in Alexandria3K

How useful can Alexandria3K be for performing literature reviews

Bachelor thesis (2024) - S.J. Verlooy (author) , Diomidis Spinellis (mentor) , G. Gousios (mentor) , Koen Langendoen (graduation committee member)

The Alexandria3K library, a versatile Python-based tool, has been expanded to include the integration of the PubMed dataset, enriching its capabilities in the analysis of scientific papers. Originally supporting major datasets like Crossref and US patents, and smaller yet s ...

Use of LLMs to Improve Affiliation Disambiguation in Alexandria3k

Bachelor thesis (2024) - D.T. Gupta (author) , Diomidis Spinellis (mentor) , Georgios Gousios (mentor) , KG Langendoen (graduation committee member)

The growth of academic publications, heterogeneity of datasets and the absence of a globally accepted organization identifier introduce the challenge of affiliation disambiguation in bibliographic databases. In this paper, we create a baseline using the currently implemented algo ...

Automated Detection and Correction of Python Code Style Violations

An Empirical Study in Open Source Projects

Master thesis (2024) - R.K. Thakoersingh (author) , Diomidis Spinellis (mentor) , Pradeep K. Murukannaiah (graduation committee member)

This thesis investigates the prevalence of Pylint warnings in open-source Python projects and evaluates the effectiveness of an AI-driven tool for automatically fixing these warnings. The study also explores how developers perceive automated code suggestions and seeks to streamline consent mechanisms for research-related code changes. The primary research questions addressed are: (1) What is the prevalence of Pylint warnings across open-source Python projects? (2) How effective is the AI tool developed to fix Pylint warnings? (3) How do developers perceive the automated suggestions? (4) How can the process of proposing research-related code changes with developer consent be streamlined?

To address these questions, the research draws on literature related to static code analysis, fault detection, and the increasing use of artificial intelligence (AI) in automated code repair. Previous studies highlight the challenges developers face in maintaining consistent code quality and the role of AI in automating such tasks.

The research follows a mixed-method approach. Quantitatively, a dataset of 205 open-source Python projects was analyzed to identify and address common Pylint warnings. An AI-driven tool was employed to attempt fixing these warnings, achieving a success rate of 88%. In 60 projects, pull requests were submitted to open source maintainers to assess the effectiveness and reception of the tool. Qualitative feedback from maintainers was collected and analyzed, leading to a shift in the contribution strategy from pull requests to submitting issues first, as this was perceived as less intrusive and more manageable by developers.

The analysis revealed a high prevalence of Pylint warnings, particularly missing-function-docstring and line-too-long, across projects of all sizes. The AI-driven tool effectively fixed 88% of the warnings, resulting in 70% of the projects being fully warning-free. However, developer responses to automated pull requests were mixed, prompting the adoption of a more collaborative issue-first approach. These results suggest that AI tools can significantly improve code quality, but challenges remain in fostering developer engagement and integrating such tools into established open source workflows.

The study has certain limitations, mainly the focus on Python projects, which may limit the generalizability of the findings to other languages or more complex projects. Furthermore, developer consent and participation were limited, which affected the full implementation of automated changes. Future research should focus on improving the integration of AI tools into developer workflows and expanding the scope of automated code fixes to more diverse and complex projects.

Author Name Disambiguation using Large Language Models

Contributions to a system for open reproducible publication research

Bachelor thesis (2024) - J. van Lieshout (author) , Diomidis Spinellis (mentor) , G. Gousios (mentor) , Koen Langendoen (graduation committee member)

Author name disambiguation, otherwise described as (publication) record linking, is a problem that has had considerable research dedicated to its solving. Author attributions, calculating research metrics and conducting literature reviews are amongst processes that experience ...

Predictive Test Case Selection and Prioritization at Adyen

Master thesis (2022) - A.H. Reurink (author) , Diomidis Spinellis (mentor) , S. Roos (mentor)

The set of regression and integration tests at many modern software companies is huge. It is difficult to run all tests after each code change, so the tests are often run for batches of code changes by different developers, late in the release cycle. This has ...

Analyzing the Criticality of NPM Packages Through a Time-Dependent Dependency Graph

Bachelor thesis (2022) - A.J.M. Brands (author) , G. Gousios (mentor) , D. Spinellis (mentor) , Avishek Anand (graduation committee member)

In (open-source) development, developers routinely rely on other libraries to improve their coding efficiency by reusing code. This reliance on other packages could cause issues when critical dependencies suddenly have a vulnerability introduced to them. This work analyzes t ...

Analyzing the effect of introducing time as a component in Python dependency graphs

Bachelor thesis (2022) - A. Purcaru (author) , D. Spinellis (mentor) , G. Gousios (mentor) , Avishek Anand (graduation committee member)

The use of open-source packages is a common practice among developers. It decreases the development time and improves maintainability. But adding a dependency to a project comes with inherent risks such as introducing vulnerabilities. A few solutions that help visualize all of the ...

Analyzing the Criticality of Apache Maven Packages Through a Temporal Dependency Graph

Bachelor thesis (2022) - D. Corlade (author) , Georgios Gousios (mentor) , Diomidis Spinellis (mentor) , Avishek Anand (graduation committee member)

Developers rely on different software to improve their efficiency as to reuse parts of code and be able to maintain it with ease, which is why open source software libraries have gained much popularity over the past years. This paper analyzes what are the most used packages fro ...

Using a Time Dependency Graph to find the most widely used Debian package

Bachelor thesis (2022) - T. Dobrev (author) , G. Gousios (mentor) , D. Spinellis (mentor) , Avishek Anand (graduation committee member)

The main principle of Open Source development is that developers can reuse different libraries over and over again to make their lives easier. That is why this practice has gained a lot of popularity. However, libraries usually depend on other already existing pieces of code. Thi ...

Finding most used software application by using a time-dependency graph

Bachelor thesis (2022) - A. Dumitru (author) , Georgios Gousios (mentor) , Diomidis Spinellis (mentor) , Avishek Anand (mentor)

Using open-source packages when developing software applications is the general practice among a vast number of software developers. However, importing open-source code which may depend on other existing technologies may lead to the appearance of a transitive dependency chain. As ...

A study of bugs found in the Ansible configuration management system

Bachelor thesis (2022) - M. Rastenis (author) , Thodoris Sotiropoulos (mentor) , Diomidis Spinellis (mentor) , F. Broz (graduation committee member)

Research that focuses on examining software bugs is critical when developing tools for preventing and for fixing software issues. Previous work in this area has explored other types of systems, such as bugs of compilers and security issues stemming from open source systems hosted ...

A Study of Bugs Found in the Puppet Configuration Management System

Bachelor thesis (2022) - M. Krupauskas (author) , Diomidis Spinellis (mentor) , Thodoris Sotiropoulos (mentor) , F. Broz (graduation committee member)

This research studies the symptoms, root causes, impact, triggers, fixes, and system dependency of bugs in the Puppet configuration management system. Puppet is a widely used open-source configuration management system that performs various administrative tasks on machines based ...

Studying bugs in the Salt Configuration Management System

Bachelor thesis (2022) - B.Y. He (author) , Diomidis Spinellis (mentor) , Thodoris Sotiropoulos (mentor) , F. Broz (graduation committee member)

Configuration management systems are a class of software used to automate system administrative tasks, one of which is the configuration of software systems. Although the automation is less error-prone than manual configuration done by a human, bugs in the source code can still c ...