E. Kapel

4 records found

What Secondary Issues Contribute to Operational Problems?

An Investigation Based on Public Postmortems

Operational incidents in software-defined systems can lead to significant disruptions, and while primary faults such as bugs or misconfigurations are well studied, secondary issues that exacerbate these failures remain underexplored. This research investigates what secondary issu ...

Understanding Software Failures Through Incident Report Analysis

An Empirical Study of 348 Incident Reports from the VOID

Software changes are a leading cause of operational failures in complex production systems. Despite the increasing use of Artificial Intelligence for Development Operations and the availability of postmortem data, research on software incidents remains fragmented and narrowly sco ...

Linking Software Changes to Incident Reports

Investigating Correlations Between Root Causes and the Mean Time To Repair of Incidents

The availability and reliability of online systems form the cornerstone of modern civilization. Companies actively try to minimize downtime during incidents, and publishing incident reports afterwards is a standard practice. However, what is missing is an overview of the distribu ...

Modern businesses increasingly rely on software-driven operations, making system reliability a critical concern. Despite advances in automated operations, gaps remain in understanding how the primary causes of system failures manifest, impact operational severity, and evolve in c ...