Efficient Window Aggregation with General Stream Slicing
Jonas Traub (Technical University of Berlin)
Philipp Grulich (DFKI GmbH)
Alejandro Rodríguez Cuéllar (Technical University of Berlin)
Sebastian Breß (DFKI GmbH, Technical University of Berlin)
Asterios Katsifodimos (TU Delft - Web Information Systems)
Tilmann Rabl (DFKI GmbH, Technical University of Berlin)
Volker Markl (Technical University of Berlin, DFKI GmbH)
Abstract
Window aggregation is a core operation in data stream processing. Existing aggregation techniques focus on reducing latency, eliminating redundant computations, and minimizing memory usage. However, each technique operates under different assumptions with respect to workload characteristics such as properties of aggregation functions (e.g., invertible, associative), window types (e.g., sliding, sessions), windowing measures (e.g., time- or count-based), and stream (dis)order. Violating the assumptions of a technique can render it unusable or drastically reduce its performance. In this paper, we present the first general stream slicing technique for window aggregation. General stream slicing automatically adapts to workload characteristics to improve performance without sacrificing its general applicability. As a prerequisite, we identify workload characteristics that affect the performance and applicability of aggregation techniques. Our experiments show that general stream slicing outperforms alternative concepts by up to one order of magnitude.