Experimental Performance Analysis of Graph Analytics Frameworks
Tim Hegeman (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Alex Iosup – Mentor
Jan S. Rellermeyer – Graduation committee member
Andy Zaidman – Graduation committee member
Abstract
Big data, the large-scale collection and analysis of data, has become ubiquitous in modern digital society. Within the big data landscape, graphs are widely used to study collections of entities and the complex relationships that connect them. The analysis of graphs has applications in social networks, logistics, finance, bioinformatics, and many other domains. With the rapidly increasing amounts of data being collected, analyzing large-scale graphs has become a necessity. To address this need, many dedicated graph analysis frameworks have been developed in recent years. However, their performance is poorly understood. In this thesis, our goal is to improve insight into the performance of graph analysis frameworks. We design the Graphalytics ecosystem, a set of complementary systems for understanding the performance of graph analysis frameworks, with a focus on two key components. First, we design, implement, and evaluate Graphalytics, a comprehensive benchmark for graph analysis frameworks that facilitates the comparison of performance between these frameworks. Second, we design, implement, and evaluate Grade10, a system for automated, in-depth performance analysis of graph analysis frameworks. Through experimental evaluation of the Graphalytics ecosystem, we gain insight into the performance of six modern graph analysis frameworks.