Anatomy of a Fix: Analyzing Solution Patterns in Public IT Incident Reports

Insights from Postmortems on Mitigations and Fixes in Production Systems

Bachelor Thesis (2025)

Author(s)

M. Georgiev (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Diomidis Spinellis – Mentor (TU Delft - Software Engineering)

Eileen Kapel – Mentor

Benedikt Ahrens – Graduation committee member (TU Delft - Programming Languages)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Incident management, AIOps, IT operations, Postmortem analysis

To reference this document use:

https://resolver.tudelft.nl/uuid:7f89bb7c-5264-4f2c-a2ad-fdcfd75df17f

Publication Year

2025

Language

English

Graduation Date

25-06-2025

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This study examined common remediation strategies by analysing publicly available IT incident reports. A six-category taxonomy (“Software Fix”, “Rollback”, “Traffic Switch”, “Hardware/Infrastructure Repair or Operation”, “Self-Resolved”, and “Undisclosed/Not Specified”) was developed to classify implemented solutions. A corpus of 1268 recent public incident reports was then collected from the VOID community database, and the solution description of each report was classified using a prompt-based approach with the LLaMA3.3-70B-Versatile large language model (LLM). The LLM classifier demonstrated substantial agreement with manual annotations on a ground-truth subset of 127 reports (Cohen’s κ = 71.4%, Macro F1 = 80.6%). The primary findings revealed that a large majority (76%) of the reports do not disclose specific technical solutions. Among reports with identifiable fixes, software fixes (5.5% of the total) were the most common. Exploratory analysis also showed a statistically significant but small relationship between solution category and incident duration. This research highlights the utility of LLMs for analysing incident reports and powering AIOps, and it underscores the need for improvement in incident reporting.
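The two agreement metrics reported in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the function names `cohen_kappa` and `macro_f1` are hypothetical, and the convention of skipping categories that appear in neither label list when averaging F1 is an assumption made here for illustration. Only the six category names come from the abstract.

```python
from collections import Counter

# The six taxonomy categories named in the abstract.
CATEGORIES = [
    "Software Fix",
    "Rollback",
    "Traffic Switch",
    "Hardware/Infrastructure Repair or Operation",
    "Self-Resolved",
    "Undisclosed/Not Specified",
]


def cohen_kappa(manual, predicted):
    """Cohen's kappa: agreement between two label lists, corrected for chance."""
    if len(manual) != len(predicted) or not manual:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(manual)
    # Observed agreement: fraction of reports where both labels match.
    observed = sum(m == p for m, p in zip(manual, predicted)) / n
    # Expected chance agreement, from each annotator's label frequencies.
    manual_counts = Counter(manual)
    predicted_counts = Counter(predicted)
    expected = sum(
        manual_counts[c] * predicted_counts[c]
        for c in set(manual) | set(predicted)
    ) / (n * n)
    if expected == 1.0:  # both lists use one identical label throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


def macro_f1(manual, predicted, labels=CATEGORIES):
    """Unweighted mean of per-category F1 scores.

    Assumption: categories absent from both lists are skipped
    rather than counted as zero.
    """
    scores = []
    for label in labels:
        pairs = list(zip(manual, predicted))
        tp = sum(m == label and p == label for m, p in pairs)
        fp = sum(m != label and p == label for m, p in pairs)
        fn = sum(m == label and p != label for m, p in pairs)
        if tp + fp + fn == 0:
            continue
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores) if scores else 0.0
```

Calling both functions with the manual annotations as the first argument and the LLM labels as the second yields fractions in the range the abstract reports as percentages. Macro F1 weights every category equally, which matters for a corpus like this one where a single category ("Undisclosed/Not Specified") dominates.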

Files

Reasearch_Project_Martin_G_RR.... (pdf)

(pdf | 0.27 MB)

License info not available