G.C. Christodoulou
Please Note
This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged into a profile.
3 records found
Master thesis
(2025)
-
Smruti Kshirsagar, A. Katsifodimos, K. Psarakis, G.C. Christodoulou, G. Iosifidis, B. Özkan
Stateful Functions-as-a-Service (SFaaS) platforms, such as Styx, are emerging as powerful abstractions for building distributed, serverless cloud applications. By combining the capabilities of FaaS with strong transactional guarantees, they enable complex, stateful workflows without requiring developers to manage infrastructure. However, they lack built-in support for analytical queries across distributed function state. This thesis addresses that gap by proposing H-Styx, whose hybrid architecture extends Styx with a snapshot-based Query Engine, enabling near-real-time OLAP queries over global state while maintaining performance isolation for transactions. The Query Engine integrates into the Styx architecture through periodic snapshots transmitted via a loosely coupled, asynchronous interface. It ingests partitioned state from the MinIO object store into the DuckDB columnar database, supports incremental delta loads, and delivers results over a Kafka-based interface, providing scalable, low-latency, fault-tolerant analytical querying.
Empirical evaluation demonstrates that H-Styx preserves transactional throughput and latency under hybrid workloads, while significantly outperforming a baseline HTAP architecture (Postgres with Streaming Replication) on analytical throughput and providing superior workload isolation. These results validate the feasibility of supporting hybrid transactional and analytical processing in SFaaS environments. Overall, H-Styx bridges a crucial capability gap in SFaaS, enabling more powerful data-driven applications in distributed, event-driven architectures.
Master thesis
(2025)
-
L. Van Mol, M. Schutte, G.C. Christodoulou, A. Katsifodimos, S.S. Chakraborty
Building scalable and consistent cloud applications is notoriously difficult due to the challenges of state management and execution consistency in distributed environments. Functions-as-a-Service (FaaS) platforms offer flexible scalability, but their weak execution guarantees force engineers to mix business logic with infrastructure concerns, adding error-handling code, retry mechanisms, and consistency checks throughout their applications. At the same time, dataflow systems like Apache Flink offer exactly-once semantics, but their functional APIs often conflict with the imperative, object-oriented style preferred by mainstream developers.
This work aims to address this disconnect, arguing that modern transactional applications, from e-commerce to payment systems to business workflows, naturally form stateful dataflow graphs. By allowing developers to write familiar imperative code that executes on dataflow systems with strong consistency guarantees, we could eliminate the need to handle many infrastructure concerns explicitly.
To this end, we introduce Cascade, a compiler pipeline and intermediate representation that bridges the gap by translating imperative Python code into stateful, parallelizable dataflow graphs. Cascade extends prior work by providing a representation that is both expressive and optimizable, and we demonstrate optimizations including parallel execution via data dependency analysis and dynamic value prefetching. Our results show significant performance gains with these optimizations, all while maintaining the strong execution guarantees of the underlying execution target. Finally, we offer avenues for future research by discussing further optimization possibilities and extensions within our proposed framework.
MovR as a Benchmark for Geo-Distributed Databases
Performance Evaluation and Insights
Bachelor thesis
(2025)
-
W.P.A. Marcu, A. Katsifodimos, O. Mráz, G.C. Christodoulou, K. Psarakis, K.G. Langendoen
Distributed systems are vital for handling large-scale data and rely on geo-distributed databases to ensure low latency and high availability. Traditional benchmarks, such as TPC-C and YCSB-T, are not designed to handle the complexities of geo-distributed environments and do not allow for configuration of multi-home transaction ratios or dynamic data access patterns. To fill this gap, we implement a benchmark based on the MovR workload and assess its performance on the Detock, Janus, SLOG, and Calvin geo-distributed database systems. Key insights revealed through experiments are that network conditions act as a major bottleneck and that high concurrency leads to unsustainable latency spikes, which severely limit scalability.