Stacking High-Level Fuzz Mutations in Big Data Applications

Bachelor Thesis (2021)
Authors

M.W.M. Oudemans (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Supervisors

Burcu Kulahcioglu Ozkan (TU Delft - Software Engineering)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2021 Melchior Oudemans
Publication Year
2021
Language
English
Graduation Date
02-07-2021
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The big data technology market is expected to grow in the coming years, which makes automated test tools for big data applications increasingly important. Fuzzing is an automated testing method that has been used in many different fields, but it has not been used frequently in the big data domain because that domain poses several challenges. BigFuzz, a method proposed in a recent study, solves these problems and shows promising results. One of the contributions of BigFuzz is its high-level mutations, which are error-type-guided and schema-aware. This paper answers the question: How does stacking high-level fuzz mutations affect the test performance for big data applications? It does so by creating different stacking strategies and evaluating their effect compared to the BigFuzz method. As evaluation metrics, the research uses the number of unique failures per trial and the distribution of unique failures found. Three stacking strategies were developed for this project: Random Stack, Smart Stack and Single Stack. The research indicates that there appear to be benefits to stacking high-level mutations. The results show that stacking algorithms find, on average, more unique failures in fewer trials than a non-stacking approach. Furthermore, Smart Stack is able to find unique failures more frequently. These empirical results suggest that stacking high-level mutations can provide an advantage over mutating only once.
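To make the idea of stacking concrete, the following Python sketch contrasts mutating an input once with applying several mutations in sequence to the same input. It is only an illustration under stated assumptions. The row format, the three mutation operators, and the names `mutate_once`, `mutate_stacked` and `max_stack` are hypothetical. They are not taken from BigFuzz or from the thesis, and the sketch does not reproduce the Random Stack, Smart Stack or Single Stack strategies.

```python
import random

# Hypothetical high-level mutations on one CSV row (a list of string fields).
# Each returns a new list and leaves its input untouched.

def empty_field(row, rng):
    """Replace one randomly chosen field with an empty string."""
    i = rng.randrange(len(row))
    return row[:i] + [""] + row[i + 1:]

def drop_column(row, rng):
    """Remove one randomly chosen field, keeping at least one field."""
    if len(row) <= 1:
        return list(row)
    i = rng.randrange(len(row))
    return row[:i] + row[i + 1:]

def change_type(row, rng):
    """Replace one randomly chosen field with a non-numeric token."""
    i = rng.randrange(len(row))
    return row[:i] + ["not_a_number"] + row[i + 1:]

MUTATIONS = [empty_field, drop_column, change_type]

def mutate_once(row, rng):
    """Non-stacking baseline: apply exactly one mutation."""
    return rng.choice(MUTATIONS)(row, rng)

def mutate_stacked(row, rng, max_stack=3):
    """Stacking: apply between 1 and max_stack mutations in sequence.

    The uniform choice of depth and operator is an assumption made
    for this sketch only.
    """
    depth = rng.randint(1, max_stack)
    mutated = list(row)
    for _ in range(depth):
        mutated = rng.choice(MUTATIONS)(mutated, rng)
    return mutated

rng = random.Random(0)           # fixed seed so runs are repeatable
seed_row = ["90001", "34", "1200"]
single = mutate_once(seed_row, rng)
stacked = mutate_stacked(seed_row, rng)
```

In this sketch a stacked input can combine several faults at once, for example a dropped column together with an emptied field, which a single mutation cannot produce.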

Files

Research_Paper_FINAL.pdf
(pdf | 0.379 Mb)
License info not available