Stacking High-Level Fuzz Mutations in Big Data Applications

Bachelor thesis (2021)

Authors

M.W.M. Oudemans (Electrical Engineering, Mathematics and Computer Science)

Contributors

Burcu Kulahcioglu Ozkan, Software Engineering (mentor)

Jérémie Decouchant, Data-Intensive Systems (coach)

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use:

http://resolver.tudelft.nl/uuid:303a0423-a43f-4f2b-a557-574f0f7151d4

Published Date

02-07-2021

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The big data technology market is expected to grow in the coming years, so automated test tools for big data applications are becoming increasingly important. Fuzzing is an automated testing method that has been used in many different fields, but it has seldom been applied in the big data domain because that domain poses several challenges. BigFuzz, a method proposed in a recent study, addresses these challenges and shows promising results. One of BigFuzz's contributions is a set of high-level mutations, which are error-type-guided and schema-aware. This paper answers the question: How does stacking high-level fuzz mutations affect the test performance for big data applications? It does so by creating different stacking strategies and evaluating their effect compared to the BigFuzz method. As evaluation metrics, the research uses the number of unique failures per trial and the distribution of unique failures found. Three stacking strategies were developed for this project: Random Stack, Smart Stack and Single Stack. The research indicates that there appear to be benefits to stacking high-level mutations. The results show that stacking algorithms find, on average, more unique failures in fewer trials than a non-stacking approach. Furthermore, Smart Stack is able to find unique failures more frequently. The empirical results suggest that stacking high-level mutations can provide an advantage over mutating only once.
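
To make the idea of stacking concrete, the following is a minimal Python sketch, not the thesis's or BigFuzz's implementation. It assumes that "stacking" means applying several high-level mutations in sequence to the same input row, in contrast to mutating only once. The mutation functions, the row representation (a list of string fields), and the `max_stack` parameter are all hypothetical names chosen for illustration, and the sketch corresponds only loosely to a random stacking strategy.

```python
import random

# Hypothetical high-level mutations on one CSV row (a list of string fields).
# These are illustrative stand-ins, not the mutations defined by BigFuzz.

def drop_column(row, rng):
    """Remove one field, imitating a missing-column error."""
    if len(row) <= 1:
        return list(row)  # keep at least one field so later mutations apply
    i = rng.randrange(len(row))
    return row[:i] + row[i + 1:]

def empty_field(row, rng):
    """Blank out one field, imitating a missing value."""
    i = rng.randrange(len(row))
    return row[:i] + [""] + row[i + 1:]

def corrupt_type(row, rng):
    """Replace one field with a non-numeric token, imitating a type error."""
    i = rng.randrange(len(row))
    return row[:i] + ["not_a_number"] + row[i + 1:]

MUTATIONS = [drop_column, empty_field, corrupt_type]

def mutate_once(row, rng):
    """Non-stacking baseline: apply exactly one randomly chosen mutation."""
    return rng.choice(MUTATIONS)(row, rng)

def mutate_stacked(row, rng, max_stack=3):
    """Stacking (assumed meaning): apply 1..max_stack mutations in sequence."""
    depth = rng.randint(1, max_stack)
    mutated = list(row)  # copy so the seed row is left untouched
    for _ in range(depth):
        mutated = rng.choice(MUTATIONS)(mutated, rng)
    return mutated
```

A call such as `mutate_stacked(["90001", "34", "12.5"], random.Random(0))` returns one mutated row that may combine several error types at once, which a single call to `mutate_once` cannot produce.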

Files

Research_Paper_FINAL.pdf

(.pdf | 0.379 Mb)