Change Point Detection In Continuous Integration Performance Tests

Master thesis (2021)

Authors

T.A.I. van der Horst, Electrical Engineering, Mathematics and Computer Science

Contributors

K.G. Langendoen, Embedded Systems (mentor)

Maurício Aniche, Software Engineering (graduation committee member)

Remy Böhmer, Philips Medical Systems Nederland NV (coach)

Faculty

Electrical Engineering, Mathematics and Computer Science

Change detection; Continuous monitoring; Control chart; Ensemble learning; Multiple classifier systems; Performance benchmarks

To reference this document use:

http://resolver.tudelft.nl/uuid:b9ef4b8e-a18e-40cb-b222-a4221cb22431

Published Date

12-07-2021

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Software testing is an integral part of the development of embedded systems. Among other reasons, tests are frequently used to ensure that a system meets all its specifications, which is especially important when designing systems for the medical industry. Software changes that have a detrimental impact on a real-time system's performance can accumulate until the system no longer meets the required performance levels. At this point, fixing accumulated defects can become a costly and complex endeavor. The goal of this thesis is to create a method that can detect changes in performance metrics from a continuous integration pipeline with minimal manual intervention.

To achieve this goal, we have created a novel online, univariate change detection method, which consists of a diverse ensemble of more than 10 existing change detection algorithms. The ensemble is robust against changes in distribution and requires little tuning. The contributions of this work include the proposal of a generic architecture to combine both statistics and decisions from individual algorithms. Related work uses simple majority voting for decision fusion in the ensemble; we demonstrate that an ensemble can benefit from more complex decision fusion, for example using a Random Forest.
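As a rough illustration of the simple majority voting that the paragraph above attributes to related work, the following minimal Python sketch fuses the binary decisions of a few toy detectors. It is not the thesis's ensemble: the detectors (`mean_shift_detector`, `cusum_detector`), their thresholds, and the `majority_vote` helper are hypothetical names chosen for this example, and a learned fusion step such as a Random Forest would replace `majority_vote` with a classifier trained on the detectors' outputs.

```python
from statistics import mean, stdev
from typing import Callable, List, Sequence

# A detector compares a recent window against a reference sample
# and returns True when it believes a change has occurred.
Detector = Callable[[Sequence[float], Sequence[float]], bool]


def mean_shift_detector(threshold: float) -> Detector:
    """Hypothetical detector: flags a change when the window mean lies more
    than `threshold` reference standard deviations from the reference mean."""
    def detect(reference: Sequence[float], window: Sequence[float]) -> bool:
        sigma = stdev(reference) or 1e-12  # guard against zero variance
        return abs(mean(window) - mean(reference)) / sigma > threshold
    return detect


def cusum_detector(drift: float, limit: float) -> Detector:
    """Hypothetical detector: one-sided (upward) CUSUM on standardised values."""
    def detect(reference: Sequence[float], window: Sequence[float]) -> bool:
        mu = mean(reference)
        sigma = stdev(reference) or 1e-12
        cumulative = 0.0
        for value in window:
            cumulative = max(0.0, cumulative + (value - mu) / sigma - drift)
            if cumulative > limit:
                return True
        return False
    return detect


def majority_vote(detectors: List[Detector],
                  reference: Sequence[float],
                  window: Sequence[float]) -> bool:
    """Decision fusion: report a change when more than half the detectors agree."""
    votes = [detector(reference, window) for detector in detectors]
    return sum(votes) * 2 > len(votes)


# Illustrative, made-up measurements (e.g. a benchmark duration per build).
reference = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7]
stable_window = [10.1, 9.9, 10.0, 10.2]
shifted_window = [11.0, 11.2, 10.9, 11.1]

ensemble = [
    mean_shift_detector(threshold=2.0),
    mean_shift_detector(threshold=3.0),
    cusum_detector(drift=0.5, limit=4.0),
]

stable_result = majority_vote(ensemble, reference, stable_window)
shifted_result = majority_vote(ensemble, reference, shifted_window)
```

Majority voting treats every detector as equally reliable. A learned fusion rule can instead weight detectors according to how well each performs on labelled data, which is the kind of benefit the abstract reports for more complex decision fusion.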

Synthetic data and a case study, using a dataset provided by Philips, are used to demonstrate that the overall ensemble consistently outperforms the individual algorithms in the ensemble. In addition, unlike the individual algorithms, the ensemble generalizes well to data that is not normally distributed and has not been encountered during training. Compared to the monitoring system for performance metrics that is currently being used by Philips, the ensemble is able to detect 75% more changes with a 50% lower false positive rate.

Files

MSc_Thesis.pdf

(pdf | 8.99 Mb)