Benchmarking checkpointing algorithms in Stream Processing Engines

Master thesis (2023)

Authors

G. Wiemers Electrical Engineering, Mathematics and Computer Science

Contributors

A Katsifodimos Web Information Systems - (supervisor 1)

G. Siachamis Web Information Systems - (supervisor 1)

Faculty

Electrical Engineering, Mathematics and Computer Science

More Info

expand_more

To reference this document use:

http://resolver.tudelft.nl/uuid:1bb76ed6-668c-49d6-a89d-427ce279413f

Published Date

2023

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Faculty

Electrical Engineering, Mathematics and Computer Science

Abstract

The use of data streams has increased a lot over the last two decades or so. and
With this increase comes the need for fast and consistent fault recovery. Rollback
recovery mechanisms from traditional distributed systems have been adapted successfully for stream engines. These mechanisms can be categorized into one of three different categories; uncoordinated, coordinated and communication induced protocols. While most well-known stream engines implement a variant of the coordinated Chandy-Lamport algorithm, there is no practical comparison available that actually confirms whether this is the optimal solution for data streams specifically. Compared to traditional distributed processing solutions, stream processing has a higher need for low latencies due to the continuously generated input. This paper aims to create more insight into the advantages and disadvantages of these solutions by implementing a checkpointing algorithm for each of these categories. These are then benchmarked using various workloads and evaluated using a number of metrics such as latency, throughput, recovery times and network overhead. From these results, it can be concluded that a coordinated approach indeed outperforms uncoordinated solutions across all of these metrics, most likely due to the need for message logging in both the uncoordinated and communication induced scenarios. Additionally the benchmarks indicate that the overhead of the communication induced approach does not outweigh its benefits, due to the rarity of the occurrence of the so-called domino effect.

Files

Thesis_gianni_wiemers_final_ve... (.pdf)

(.pdf | 0.918 Mb)