Stacking High-Level Fuzz Mutations in Big Data Applications

Abstract

The big data technology market is expected to grow in the coming years, which makes automated test tools for big data applications increasingly important. Fuzzing is an automated testing method that has been used in many different fields, but it has rarely been applied in the big data domain because that domain poses several challenges. BigFuzz, a method proposed in a recent study, addresses these challenges and shows promising results. One of the contributions of BigFuzz is its set of high-level mutations, which are error-type-guided and schema-aware. This paper answers the question: How does stacking high-level fuzz mutations affect the test performance for big data applications? It does so by designing different stacking strategies and evaluating their effect compared to the BigFuzz method. The evaluation metrics are the number of unique failures per trial and the distribution of unique failures found. The three stacking strategies developed for this project are Random Stack, Smart Stack, and Single Stack. This research indicates that stacking high-level mutations appears to be beneficial. The results show that the stacking algorithms find, on average, more unique failures in fewer trials than a non-stacking approach. Furthermore, Smart Stack is able to find unique failures more frequently. These empirical results suggest that stacking high-level mutations can provide an advantage over mutating only once.
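To make the notion of "stacking" concrete, the following minimal sketch contrasts a single mutation with a stack of mutations applied in sequence to one input row. It is an illustration under stated assumptions, not the BigFuzz implementation or any of the three strategies evaluated in this paper: the mutation functions (`change_type`, `drop_column`, `empty_field`), the `max_stack` parameter, and the uniformly random choice of stack depth are all hypothetical.

```python
import random

# Hypothetical schema-aware mutations on one CSV row (a list of string fields).
# They only illustrate the idea; they are not the BigFuzz mutation set.

def change_type(row, rng):
    """Replace one field with a value of the wrong type."""
    out = list(row)
    out[rng.randrange(len(out))] = "not_a_number"
    return out

def drop_column(row, rng):
    """Remove one field, keeping at least one column."""
    if len(row) <= 1:
        return list(row)
    i = rng.randrange(len(row))
    return row[:i] + row[i + 1:]

def empty_field(row, rng):
    """Blank out one field."""
    out = list(row)
    out[rng.randrange(len(out))] = ""
    return out

MUTATIONS = [change_type, drop_column, empty_field]

def mutate_once(row, rng):
    """Non-stacking baseline: apply exactly one mutation."""
    return rng.choice(MUTATIONS)(row, rng)

def mutate_stacked(row, rng, max_stack=3):
    """Stacking: apply between 1 and max_stack mutations in sequence.

    The random depth and random choice of mutation are assumptions
    made for this sketch.
    """
    out = list(row)
    for _ in range(rng.randint(1, max_stack)):
        out = rng.choice(MUTATIONS)(out, rng)
    return out

if __name__ == "__main__":
    rng = random.Random(0)
    seed_row = ["1", "alice", "3.5"]
    print("single :", mutate_once(seed_row, rng))
    print("stacked:", mutate_stacked(seed_row, rng))
```

A stacked input can combine several error types at once (for example, a missing column together with an empty field), which is the kind of input a single mutation cannot produce in one step.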