Online state migration in modern stream processing engines

More Info
expand_more

Abstract

Stream Processing Engines (SPEs) are called upon to help solve problems around big and volatile data, while satisfying the needs for near real-time processing. In order for such systems to be considered effective solutions to such problems at scale, efficient elasticity and non dataflow-disturbing reconfiguration operations within are a necessity. To that end, we visit the problem of online state migration, as the biggest obstacle in achieving such a desired behaviour, in SPEs that support stateful functions. We make an attempt to formally define the problem and associated sub-tasks, compare existing solutions and identify key aspects, as well as design and implement our own solution. Our testing shows that the lazy-fetch online state migration process proposed, outperforms a simple baseline state migration design by orders of magnitude in end-to-end latency observed, scales much better under increased workloads and relies on consistent design concepts to claim exactly-once semantics.

Files

License info not available