Benchmarking Distributed Database Performance and Dependability under Partial System Failures

Master Thesis (2021)
Author(s)

R.L. Bes (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

A Katsifodimos – Mentor (TU Delft - Web Information Systems)

Marios Fragkoulis – Mentor (TU Delft - Web Information Systems)

Angel Bravo – Mentor

G.J. Houben – Graduation committee member (TU Delft - Web Information Systems)

Mauricio Aniche – Graduation committee member (TU Delft - Software Engineering)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2021 Ruben Bes
Publication Year
2021
Language
English
Graduation Date
14-04-2021
Awarding Institution
Delft University of Technology
Programme
Computer Science | Data Science and Technology
Sponsors
Adyen B.V.
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Many types of database management systems (DBMSs) exist, but finding the one that suits a specific use case is becoming increasingly difficult. Benchmarks allow one to compare various systems, but in a world where distributed DBMSs are increasingly used for mission-critical purposes, we find that most existing benchmarks neglect fault tolerance and dependability. In this Master’s thesis, we design a modular and highly extensible framework capable of introducing partial system failures in a distributed database deployment. We also implement a proof-of-concept version of our framework and use it to evaluate the performance of a CockroachDB cluster deployed through Kubernetes: we run the TPC-C benchmark while injecting faults and measure the resulting changes in performance. Using this proof-of-concept implementation, we demonstrate the faults our system can introduce and find that the impact of our high-level node failures depends strongly on the time a node has to perform a graceful shutdown and notify its peers or connected clients.

Files

Thesis_Ruben_Bes.pdf
(pdf | 0.911 MB)
License info not available