Benchmarking Distributed Database Performance and Dependability under Partial System Failures

Bes, R.L.

Master thesis (2021)

Authors

R.L. Bes Electrical Engineering, Mathematics and Computer Science

Contributors

Asterios Katsifodimos Web Information Systems (mentor)

Marios Fragkoulis Web Information Systems (mentor)

Angel Bravo (mentor)

G. J. Houben Web Information Systems (graduation committee member)

Maurício Aniche Software Engineering (graduation committee member)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Performance Evaluation, Benchmarking, Dependability, Distributed Database Systems, Fault Injection

To reference this document use:

http://resolver.tudelft.nl/uuid:20b15e7e-2247-4667-ada4-1a3ad7d05aaa

Published Date

14-04-2021

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Many types of database management systems exist, but finding the right one for a specific use case is becoming increasingly difficult. Benchmarks allow one to compare various systems, but in a world where distributed DBMSs are increasingly used for mission-critical purposes, we find that most existing benchmarks neglect fault tolerance and dependability. In this Master’s Thesis, we design a modular and highly extensible framework capable of introducing partial system failures in a distributed database deployment. We also implement a proof-of-concept version of our framework, which we use to evaluate the performance of a CockroachDB cluster deployed through Kubernetes by running the TPC-C benchmark while we inject faults and measure the resulting changes in performance. Using this proof-of-concept implementation, we demonstrate the faults our system can introduce and find that the impact of our high-level node failures depends strongly on the time a node has to perform a graceful shutdown and notify its peers or connected clients.
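As a rough illustration of the kind of measurement the abstract describes (running a benchmark, injecting a fault, and observing the change in performance), the sketch below compares mean throughput before and during a fault window. It is not taken from the thesis: the function name `throughput_impact`, the sample format, and all numbers are assumptions made up for this example.

```python
from statistics import mean


def throughput_impact(samples, fault_start, fault_end):
    """Compare mean throughput inside a fault window with the baseline before it.

    samples: list of (timestamp_seconds, transactions_per_second) pairs.
    Returns (baseline_mean, fault_mean, relative_drop).
    """
    # Hypothetical helper: baseline is everything observed before the fault.
    baseline = [tps for t, tps in samples if t < fault_start]
    # Samples observed while the fault was active.
    during = [tps for t, tps in samples if fault_start <= t < fault_end]
    if not baseline or not during:
        raise ValueError("need samples both before and during the fault window")
    base_mean = mean(baseline)
    fault_mean = mean(during)
    # Fraction of baseline throughput lost while the fault was active.
    drop = (base_mean - fault_mean) / base_mean
    return base_mean, fault_mean, drop


# Made-up numbers purely for illustration, not results from the thesis.
samples = [(0, 100.0), (10, 100.0), (20, 40.0), (30, 60.0), (40, 100.0)]
base, fault, drop = throughput_impact(samples, fault_start=20, fault_end=40)
```

Under these assumptions, the same comparison could be repeated for different graceful-shutdown periods to see how the relative drop changes with the time a node is given to shut down.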

Files

Thesis_Ruben_Bes.pdf

(pdf | 0.911 MB)

License info not available