Identifying Informative System Metrics: Predicting the predictability of time series using entropy

Master thesis (2021)

Authors

P.S. Patil Electrical Engineering, Mathematics and Computer Science

Contributors

A. van Deursen Software Technology (supervisor 1)

G. Gousios Software Engineering (supervisor 2)

Jan S. Rellermeyer Data-Intensive Systems (coach)

Faculty

Electrical Engineering, Mathematics and Computer Science

More Info

To reference this document use:

http://resolver.tudelft.nl/uuid:7a655689-9c0d-4d5d-aab3-365ebfea45b1

Published Date

16-03-2021

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Building predictive models from cloud metrics, for tasks such as incident prediction, is becoming ubiquitous in cloud monitoring. For such a forecasting task, knowing beforehand which system metrics are predictable makes it easier to build good models. Quantifying the predictability of cloud metrics lets us rank the available system metrics and select the subset with the lowest complexity. Moreover, storing informative metrics for a longer period can result in better forecasting. This thesis presents a novel entropy method for quantifying the complexity of time series: Reverse weighted Dispersion Entropy (RWDE). We also present an exploratory case study, carried out at ING, a large banking company with an in-house cloud architecture, to understand and quantify the complexity of cloud metrics. We run experiments on simulated signals to compare RWDE with other entropy methods, and we apply RWDE to cloud metric data from ING to approximate the predictability of these metrics. The experimental results show that RWDE performs better than the other entropy methods and can be used to select informative cloud metrics for a forecasting task. Further, we establish a relationship between RWDE and the model-based predictability of cloud metrics: for each cloud metric, we compare RWDE with predictions from various forecasting models. Our results show that practitioners can use this relationship as a heuristic to identify forecasting models that are unsuitable for certain cloud metrics. We make RWDE and the other entropy methods discussed in this study available as an open-source Python package.
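
To make the idea of entropy-based complexity concrete, the sketch below computes plain normalized dispersion entropy, the standard measure that RWDE's name suggests it builds on. It is not the RWDE method of the thesis, whose definition is not given in this abstract, and it is not the API of the thesis's Python package. The function name `dispersion_entropy` and the parameters `m` (embedding dimension), `c` (number of classes), and `delay` are illustrative assumptions.

```python
import math
from collections import Counter
from statistics import NormalDist, fmean, pstdev


def dispersion_entropy(x, m=2, c=3, delay=1):
    """Normalized dispersion entropy of a 1-D series.

    Illustrative sketch only (hypothetical helper, not the thesis's RWDE).
    Values near 0 indicate a regular series, values near 1 an irregular one.
    """
    x = [float(v) for v in x]
    n_vec = len(x) - (m - 1) * delay  # number of embedding vectors
    if n_vec <= 0:
        raise ValueError("series too short for the chosen m and delay")

    mu, sigma = fmean(x), pstdev(x)
    if sigma == 0:
        return 0.0  # constant series: a single pattern, zero entropy

    # Map each sample to (0, 1) with the normal CDF, then to a class in 1..c.
    cdf = NormalDist(mu, sigma).cdf
    classes = [min(c, max(1, round(c * cdf(v) + 0.5))) for v in x]

    # Count how often each dispersion pattern (tuple of m classes) occurs.
    patterns = Counter(
        tuple(classes[i + k * delay] for k in range(m)) for i in range(n_vec)
    )

    # Shannon entropy of the pattern distribution, normalized by ln(c^m).
    entropy = -sum((k / n_vec) * math.log(k / n_vec) for k in patterns.values())
    return entropy / math.log(c ** m)
```

Under these assumptions, a periodic signal such as a sine wave should score lower than white noise of the same length, which matches the abstract's reading of low complexity as a sign of higher predictability.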

Files

Msc_ThesisFinal_pradyot.pdf

(.pdf | 1.4 MB)