An experimental evaluation of auto-scaling techniques for distributed stream processing systems

Abstract

The introduction of cloud hosting has made it possible to elastically provision distributed stream processing engines (SPEs). By dynamically scaling the individual operators of the system, resource consumption can be minimised while still meeting the system's service-level objectives. Many auto-scaling techniques that make scaling decisions based on the current state of the system have been proposed in the literature. However, these techniques are poorly evaluated and rarely compared with each other, which makes it difficult to determine the state of the art for auto-scaling in SPEs and slows progress in the field. In this paper, we design and implement a modular framework to evaluate the performance of state-of-the-art auto-scalers targeting SPEs. We implement the state-of-the-art auto-scalers Dhalion, DS2, and the approach of Varga et al., using the Kubernetes Horizontal Pod Autoscaler as a baseline. We perform an end-to-end experimental evaluation of the auto-scalers and investigate their performance on different queries and workload patterns. Furthermore, we investigate the convergence time of the auto-scalers and evaluate their scaling accuracy. The results emphasise the difficulty of capturing the complex relationships between operators and of balancing resource efficiency against system performance. Moreover, they show the inherent weakness of reactive auto-scalers, which respond slowly to changing workloads, and reveal the importance of considering the current health of the system when issuing scaling actions.
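
For context, the reactive baseline follows the documented Kubernetes Horizontal Pod Autoscaler rule, which adjusts replica counts in proportion to the ratio of an observed metric to its target. A minimal Python sketch of such a reactive rule is given below; the function and parameter names are illustrative and not taken from the paper.

    import math

    def desired_replicas(current_replicas: int,
                         observed_metric: float,
                         target_metric: float,
                         min_replicas: int = 1,
                         max_replicas: int = 16) -> int:
        """Reactive HPA-style rule: scale replicas proportionally to the
        ratio of the observed metric (e.g. CPU utilisation or consumer lag)
        to its target value, clamped to the allowed range."""
        ratio = observed_metric / target_metric
        replicas = math.ceil(current_replicas * ratio)
        return max(min_replicas, min(max_replicas, replicas))

    # Example: 4 replicas at 90% CPU with a 60% target -> scale out to 6.
    print(desired_replicas(4, 0.9, 0.6))

Because the rule only reacts to metrics already in violation of their targets, any such policy inherently lags behind sudden workload changes, which is the weakness the evaluation highlights for reactive auto-scalers.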