Global State Queries in Stream Processing
M.S. Patil (TU Delft - Electrical Engineering, Mathematics and Computer Science)
A Katsifodimos – Mentor (TU Delft - Data-Intensive Systems)
Alexios Voulimeneas – Graduation committee member (TU Delft - Cyber Security)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
While database systems have matured significantly over the past few decades, the rapid growth of real-time analytics for fast decision making has paved the way for multipurpose, high-performance systems. As stream processing also matures, it is of interest to explore its full functional capabilities, such as state management. Most streaming systems do not expose their state to external systems for querying, which limits the ability to derive value from live, mutable state data. In this thesis we present Q-Styx, a system that exposes the live state of stateful operators in a streaming engine to external queries. We introduce a global state store that maintains a copy of the distributed state across the system without the need for an external database. With strong isolation guarantees for consistent results, our implementation balances the tradeoff between performance isolation and data freshness while having minimal impact on the core transactional capabilities of the streaming engine.