Software changes are a leading cause of operational failures in complex production systems. Despite the increasing use of Artificial Intelligence for Development Operations and the availability of postmortem data, research on software incidents remains fragmented and narrowly scoped. This study aims to provide a generalizable understanding of software and change-induced incidents through structured analysis of 348 real-world incident reports from the Verica Open Incident Database. Using few-shot prompting with the GPT-4.1 Mini model, we extract key incident characteristics (root cause, triggering change, impact, severity, and remediation) and apply clustering to identify recurring incident archetypes. Our method achieves over 80% annotation accuracy on a manually labeled subset. We find that over half of incidents stem from software changes, with deployments and configuration updates disproportionately associated with high severity and manual remediation. Capacity issues and code defects are leading root causes. Clustering uncovers several prominent archetypes, including capacity-driven outages, defect-induced degradations, and hybrid failures involving improper changes. These findings support scalable incident analysis and can inform more context-aware operational strategies.
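
The annotation step described above can be illustrated with a minimal sketch. It assumes the extracted characteristics are stored as flat dictionaries and that accuracy is measured per field as exact agreement with the manual label. The field names, the prompt wording, and the helper functions `build_few_shot_prompt` and `annotation_accuracy` are hypothetical; the abstract does not specify the study's actual prompt, label vocabulary, or accuracy definition, and no model call is made here.

```python
import json

# Hypothetical field names, mirroring the characteristics listed in the abstract.
FIELDS = ["root_cause", "triggering_change", "impact", "severity", "remediation"]


def build_few_shot_prompt(examples, report_text):
    """Assemble a few-shot prompt: instruction, labeled examples, then the new report."""
    parts = [
        "Extract the following fields from the incident report and answer as JSON: "
        + ", ".join(FIELDS) + "."
    ]
    for ex in examples:
        parts.append("Report:\n" + ex["report"])
        parts.append("Answer:\n" + json.dumps(ex["labels"], sort_keys=True))
    # The unlabeled report comes last, leaving the answer for the model to complete.
    parts.append("Report:\n" + report_text)
    parts.append("Answer:")
    return "\n\n".join(parts)


def annotation_accuracy(predicted, manual):
    """Per-field fraction of reports where the model label equals the manual label.

    Assumption: a field missing from both dictionaries counts as agreement.
    """
    if len(predicted) != len(manual):
        raise ValueError("predicted and manual must have the same length")
    if not manual:
        raise ValueError("need at least one labeled report")
    return {
        field: sum(p.get(field) == m.get(field) for p, m in zip(predicted, manual))
        / len(manual)
        for field in FIELDS
    }
```

The prompt string would be sent to whichever model is used for annotation, and the parsed responses compared against the manually labeled subset with `annotation_accuracy`. Reporting agreement per field rather than as a single number makes it easier to see which characteristics are harder to extract reliably.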