Experimental Performance Analysis of Graph Analytics Frameworks

Abstract

Big data, the large-scale collection and analysis of data, has become ubiquitous in modern digital society. Within the big data landscape, graphs are widely used to study collections of entities and the complex relationships that connect them. The analysis of graphs has applications in social networks, logistics, finance, bioinformatics, and many other domains. As the amount of data being collected grows rapidly, analyzing large-scale graphs has become a necessity. To address this need, many dedicated graph analysis frameworks have been developed in recent years. However, their performance is poorly understood. In this thesis, our goal is to improve insight into the performance of graph analysis frameworks. We design the Graphalytics ecosystem, a set of complementary systems for understanding the performance of graph analysis frameworks, with a focus on two key components. First, we design, implement, and evaluate Graphalytics, a comprehensive benchmark for graph analysis frameworks that facilitates performance comparison across these frameworks. Second, we design, implement, and evaluate Grade10, a system for automated, in-depth performance analysis of graph analysis frameworks. Through experimental evaluation of the Graphalytics ecosystem, we gain insight into the performance of six modern graph analysis frameworks.