Spatial-Temporal Prediction Models for Active Ticket Managing in Data Centers

Journal Article (2018)
Author(s)

Ji Xue (College of William and Mary)

Robert Birke (Google LLC)

Y. Chen (Zurich Lab)

Evgenia Smirni (College of William and Mary)

Affiliation
External organisation
DOI related publication
https://doi.org/10.1109/TNSM.2018.2794409
Publication Year
2018
Language
English
Issue number
1
Volume number
15
Pages (from-to)
39-52

Abstract

Performance ticket handling is an expensive operation in data centers, where physical boxes host multiple virtual machines (VMs). A large body of tickets arises from resource usage warnings, e.g., CPU and RAM usage that exceeds predefined thresholds. The transient nature of CPU and RAM usage and its strong correlation across time among co-located VMs within boxes drastically increase the complexity of ticket management. Based on large resource usage data collected from production data centers, with 6K physical boxes and more than 80K VMs, we first discover patterns of spatial and temporal dependencies among/within the usage series of co-located resources. Leveraging our key findings, we develop an active ticket managing (ATM) system that aims to drastically reduce usage tickets. ATM consists of: 1) a spatial-temporal dependency-based time series prediction methodology and 2) a proactive capacity planning policy for CPU and RAM resources, covering VMs co-located within a box and boxes within a single data center client. ATM exploits the spatial-temporal dependency across/within multiple resources of co-located VMs and single-client boxes for usage prediction, and then actuates proactive capacity planning. Evaluation results on traces of 6K physical boxes from operating data centers show that ATM provides accurate prediction of usage series in cloud data centers with low computational overhead. At the same time, ATM achieves significant ticket reductions of up to 60% for both VM and box usage series.
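
The link between threshold-based usage tickets and proactive capacity planning can be illustrated with a minimal sketch. It is not the ATM method from the article: the function names, the 50% threshold, and the demand values are all hypothetical, and a known demand series stands in for the output of a prediction model.

```python
def count_tickets(demand, capacity, threshold=0.5):
    """Count samples whose utilization (demand / capacity) exceeds the threshold.

    Each such sample is treated as one usage ticket (an assumption made
    for illustration only).
    """
    return sum(1 for d in demand if d > threshold * capacity)


def capacity_for(predicted_demand, threshold=0.5):
    """Smallest capacity that keeps the predicted peak at the threshold.

    With thresholds that are not exactly representable in floating point,
    a small headroom factor may be needed on top of this value.
    """
    return max(predicted_demand) / threshold


# Hypothetical CPU demand of one VM, in cores, over five time windows.
demand = [2.0, 3.5, 5.0, 4.2, 2.8]
current_capacity = 6.0

before = count_tickets(demand, current_capacity)  # tickets at current size
new_capacity = capacity_for(demand)               # resize ahead of the peak
after = count_tickets(demand, new_capacity)       # tickets after resizing

print(before, new_capacity, after)
```

In this toy setting, resizing to the computed capacity removes every threshold violation because the demand is known exactly; with real predictions, the reduction would depend on prediction accuracy.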

Metadata only record. There are no files for this record.