Scotty

Efficient window aggregation for out-of-order stream processing

Conference Paper (2018)
Author(s)

Jonas Traub (Technical University of Berlin)

Philipp Marian Grulich (DFKI GmbH)

Alejandro Rodriguez Cuellar (Technical University of Berlin)

Sebastian Bress (DFKI GmbH, Technical University of Berlin)

Asterios Katsifodimos (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Tilmann Rabl (DFKI GmbH, Technical University of Berlin)

Volker Markl (Technical University of Berlin, DFKI GmbH)

Research Group
Web Information Systems
DOI related publication
https://doi.org/10.1109/ICDE.2018.00135 Final published version
More Info
expand_more
Publication Year
2018
Language
English
Research Group
Web Information Systems
Article number
8509356
Pages (from-to)
1304-1307
ISBN (electronic)
9781538655207
Event
34th IEEE International Conference on Data Engineering, ICDE 2018 (2018-04-16 - 2018-04-19), Paris, France
Downloads counter
417
Collections
Institutional Repository
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Computing aggregates over windows is at the core of virtually every stream processing job. Typical stream processing applications involve overlapping windows and, therefore, cause redundant computations. Several techniques prevent this redundancy by sharing partial aggregates among windows. However, these techniques do not support out-of-order processing and session windows. Out-of-order processing is a key requirement to deal with delayed tuples in case of source failures such as temporary sensor outages. Session windows are widely used to separate different periods of user activity from each other. In this paper, we present Scotty, a high throughput operator for window discretization and aggregation. Scotty splits streams into non-overlapping slices and computes partial aggregates per slice. These partial aggregates are shared among all concurrent queries with arbitrary combinations of tumbling, sliding, and session windows. Scotty introduces the first slicing technique which (1) enables stream slicing for session windows in addition to tumbling and sliding windows and (2) processes out-of-order tuples efficiently. Our technique is generally applicable to a broad group of dataflow systems which use a unified batch and stream processing model. Our experiments show that we achieve a throughput an order of magnitude higher than alternative state-of-The-Art solutions.

Files

Icde18_scotty.pdf
(pdf | 0.422 Mb)
License info not available