Design and Experimental Evaluation of Distributed Heterogeneous Graph-Processing Systems

Conference Paper (2016)
Author(s)

Yong Guo (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Ana Lucia Varbanescu (Universiteit van Amsterdam)

Dick Epema (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Alex Iosup (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Research Group
Data-Intensive Systems
DOI related publication
https://doi.org/10.1109/ccgrid.2016.53 Final published version
More Info
expand_more
Publication Year
2016
Language
English
Research Group
Data-Intensive Systems
Pages (from-to)
1-10
ISBN (electronic)
978-1-5090-2453-7
Event
CCGRID 2016 (2016-05-16 - 2016-05-19), Cartagena de Indias, Cartagena, Colombia
Downloads counter
360
Collections
Institutional Repository
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Graph processing is increasingly used in a variety of domains, from engineering to logistics and from scientific computing to online gaming. To process graphs efficiently, GPU-enabled graph-processing systems such as TOTEM and Medusa exploit the GPU or the combined CPU+GPU capabilities of a single machine. Unlike scalable distributed CPU-based systems such as Pregel and GraphX, existing GPU-enabled systems are restricted to the resources of a single machine, including the limited amount of GPU memory, and thus cannot analyze the increasingly large-scale graphs we see in practice. To address this problem, we design and implement three families of distributed heterogeneous graph-processing systems that can use both the CPUs and GPUs of multiple machines. We further focus on graph partitioning, for which we compare existing graph-partitioning policies and a new policy specifically targeted at heterogeneity. We implement all our distributed heterogeneous systems based on the programming model of the single-machine TOTEM, to which we add (1) a new communication layer for CPUs and GPUs across multiple machines to support distributed graphs, and (2) a workload partitioning method that uses offline profiling to distribute the work on the CPUs and the GPUs. We conduct a comprehensive real-world performance evaluation for all three families. To ensure representative results, we select 3 typical algorithms and 5 datasets with different characteristics. Our results include algorithm run time, performance breakdown, scalability, graph partitioning time, and comparison with other graph-processing systems. They demonstrate the feasibility of distributed heterogeneous graph processing and show evidence of the high performance that can be achieved by combining CPUs and GPUs in a distributed environment.

Files

CCGRID_2016_paper_53.pdf
(pdf | 0.365 Mb)
License info not available