Testing Byzantine Fault Tolerant Algorithms
Evaluating the correctness of Tendermint protocol using ByzzFuzz
A.F. Nowakowski (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Burcu Kulahcioglu Ozkan – Mentor (TU Delft - Software Engineering)
J.M. Louro Neto – Mentor (TU Delft - Software Engineering)
Jérémie Decouchant – Graduation committee member (TU Delft - Data-Intensive Systems)
Abstract
The reliability of Byzantine Fault Tolerant (BFT) consensus protocols is critical for the robustness of modern distributed systems such as blockchain technologies. Testing BFT protocols is crucial because faults in their implementations can let malicious users exploit vulnerabilities, resulting in financial losses, data corruption, or system unavailability. Such incidents, as seen in real-world attacks on blockchain systems, underscore the need for rigorous testing methodologies to ensure protocol correctness and resilience under adverse conditions.
This paper evaluates the implementation of the Tendermint protocol in the ByzzBench framework using ByzzFuzz, a testing approach for BFT consensus protocols. ByzzFuzz introduces structured mutations to simulate real-world fault scenarios, enabling the identification of incorrect behavior. The main question addressed in this study is: Can ByzzFuzz detect subtle protocol faults more effectively than baseline testing methods, and how do mutation strategies influence fault detection performance?
Through extensive testing, ByzzFuzz successfully uncovered violations in the Tendermint implementation, demonstrating its capability to detect subtle protocol faults. A comparative analysis with baseline testing methods revealed that ByzzFuzz provides greater fault coverage, identifying nuanced issues that the baseline approach missed. Furthermore, the study evaluated the effectiveness of small-scope and any-scope message mutations, which change a value in a message incrementally and arbitrarily, respectively. Small-scope mutations proved more effective at finding faults.
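
To make the distinction between the two mutation strategies concrete, the following is a minimal Python sketch, not the ByzzFuzz or ByzzBench implementation. The `Vote` message, its fields, and both function names are hypothetical and chosen only for illustration. It assumes that a small-scope mutation moves a numeric field by a single step, while an any-scope mutation replaces it with an arbitrary value from a wide range.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Vote:
    # Hypothetical, simplified consensus message (not a ByzzBench type).
    height: int
    round_number: int


def small_scope_mutation(vote: Vote, rng: random.Random) -> Vote:
    """Change the round incrementally: one step up, or one step down if possible."""
    candidates = [vote.round_number + 1]
    if vote.round_number > 0:
        # Rounds are assumed to be non-negative, so only step down above zero.
        candidates.append(vote.round_number - 1)
    return replace(vote, round_number=rng.choice(candidates))


def any_scope_mutation(vote: Vote, rng: random.Random,
                       max_round: int = 1_000_000) -> Vote:
    """Replace the round with an arbitrary value in [0, max_round]."""
    return replace(vote, round_number=rng.randint(0, max_round))


rng = random.Random(42)  # fixed seed so runs are reproducible
original = Vote(height=5, round_number=2)
small = small_scope_mutation(original, rng)
arbitrary = any_scope_mutation(original, rng)
```

Under these assumptions, a small-scope mutation always yields a round adjacent to the original, so the mutated message stays close to values the protocol plausibly handles, whereas an any-scope mutation can land anywhere in the range.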