Global-State Querying in Stream Processing using Snapshots

Master Thesis (2025)
Author(s)

S.S. Kshirsagar (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

A. Katsifodimos – Mentor (TU Delft - Data-Intensive Systems)

K. Psarakis – Mentor (TU Delft - Data-Intensive Systems)

G.C. Christodoulou – Mentor (TU Delft - Data-Intensive Systems)

George Iosifidis – Graduation committee member (TU Delft - Networked Systems)

Burcu Kulahcioglu Ozkan – Graduation committee member (TU Delft - Software Engineering)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
17-07-2025
Awarding Institution
Delft University of Technology
Programme
Computer Science | Data Science and Technology
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Stateful Functions-as-a-Service (SFaaS) platforms, such as Styx, are emerging as powerful abstractions for building distributed, serverless cloud applications. By combining the capabilities of FaaS with strong transactional guarantees, they enable complex, stateful workflows without requiring developers to manage infrastructure. However, they lack built-in support for analytical queries across distributed function state. This thesis addresses that gap by proposing H-Styx, a hybrid architecture that extends Styx with a snapshot-based Query Engine, enabling near-real-time OLAP queries over global state while maintaining performance isolation for transactions. The Query Engine integrates into the Styx architecture by consuming periodic snapshots transmitted over a loosely coupled, asynchronous interface. It ingests partitioned state from the MinIO object store into the DuckDB columnar database, supports incremental delta loads, and delivers results over a Kafka-based interface, achieving scalable, low-latency analytical querying with robust fault tolerance.

Empirical evaluation demonstrates that H-Styx preserves transactional throughput and latency under hybrid workloads, while significantly outperforming a baseline HTAP architecture (Postgres with Streaming Replication) on analytical throughput and providing superior workload isolation. These results validate the feasibility of supporting hybrid transactional and analytical processing in SFaaS environments. Overall, H-Styx bridges a crucial capability gap in SFaaS, enabling more powerful data-driven applications in distributed, event-driven architectures.
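
The incremental delta loads mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the `Delta` record, the table layout, and the function names are hypothetical, and Python's built-in `sqlite3` stands in for DuckDB so that the example has no external dependencies. MinIO and Kafka are left out entirely. The sketch assumes each partition emits deltas with increasing snapshot ids, applies each delta in one transaction, and skips deltas that were already loaded, so a replay after a failure does not change the result.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Delta:
    """Hypothetical delta snapshot for one partition."""
    partition: int
    snapshot_id: int
    upserts: dict = field(default_factory=dict)  # key -> new value
    deletes: set = field(default_factory=set)    # keys removed since the last snapshot


def open_store():
    # In-memory SQLite database; a stand-in for the analytical store (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE state (part INTEGER, item_key TEXT, val REAL, "
        "PRIMARY KEY (part, item_key))"
    )
    # Tracks the newest snapshot id loaded per partition.
    conn.execute("CREATE TABLE loaded (part INTEGER PRIMARY KEY, snapshot_id INTEGER)")
    return conn


def apply_delta(conn, delta):
    """Apply one delta. Returns False if an equal or newer snapshot was already loaded."""
    row = conn.execute(
        "SELECT snapshot_id FROM loaded WHERE part = ?", (delta.partition,)
    ).fetchone()
    if row is not None and row[0] >= delta.snapshot_id:
        return False
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO state VALUES (?, ?, ?)",
            [(delta.partition, k, v) for k, v in delta.upserts.items()],
        )
        conn.executemany(
            "DELETE FROM state WHERE part = ? AND item_key = ?",
            [(delta.partition, k) for k in delta.deletes],
        )
        conn.execute(
            "INSERT OR REPLACE INTO loaded VALUES (?, ?)",
            (delta.partition, delta.snapshot_id),
        )
    return True


def total_value(conn):
    """Example analytical query over the merged state of all partitions."""
    return conn.execute("SELECT COALESCE(SUM(val), 0) FROM state").fetchone()[0]
```

A caller would create the store once with `open_store()`, pass every incoming delta to `apply_delta`, and run queries such as `total_value` against the merged table. Because the query side only reads the loaded copy, it does not touch the state that the transactional functions operate on, which is the isolation property the abstract describes.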

Files

Thesis_Report_H_Styx.pdf
(pdf | 3.51 MB)
License info not available