Online state migration in modern stream processing engines
Theodoros Veneti (TU Delft - Electrical Engineering, Mathematics and Computer Science)
A. Katsifodimos – Mentor (TU Delft - Web Information Systems)
J.E.A.P. Decouchant – Graduation committee member (TU Delft - Data-Intensive Systems)
More Info
expand_more
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Stream Processing Engines (SPEs) are called upon to help solve problems around big and volatile data, while satisfying the needs for near real-time processing. In order for such systems to be considered effective solutions to such problems at scale, efficient elasticity and non dataflow-disturbing reconfiguration operations within are a necessity. To that end, we visit the problem of online state migration, as the biggest obstacle in achieving such a desired behaviour, in SPEs that support stateful functions. We make an attempt to formally define the problem and associated sub-tasks, compare existing solutions and identify key aspects, as well as design and implement our own solution. Our testing shows that the lazy-fetch online state migration process proposed, outperforms a simple baseline state migration design by orders of magnitude in end-to-end latency observed, scales much better under increased workloads and relies on consistent design concepts to claim exactly-once semantics.