Efficient High Performance Computing on Heterogeneous Platforms

Abstract

Heterogeneous platforms combine different types of processing units in a compute node (e.g., CPUs+GPUs, CPUs+MICs) or in a chip package (e.g., APUs). Such platforms keep gaining popularity in computer systems ranging from supercomputers to mobile devices. In this context, improving their efficiency and usability has become increasingly important. In this thesis, we develop systematic methods that allow a large variety of data parallel applications to utilize heterogeneous platforms efficiently. Specifically, (1) we evaluate the suitability of OpenCL as a unified programming model for heterogeneous computing and improve OpenCL's efficiency for programming heterogeneous platforms; (2) we develop a workload partitioning framework to accelerate imbalanced applications on heterogeneous platforms, where we match the heterogeneity of the platform with the imbalance of the workload; (3) we propose a model-based prediction method to correctly and quickly predict the optimal workload partitioning, maximizing the performance gain while speeding up the partitioning process; (4) we generalize a systematic workload partitioning approach that improves performance for both balanced and imbalanced applications, for applications with different datasets and execution scenarios, and for platforms with different hardware mixes; (5) we design an application analyzer that analyzes application kernel structures and selects a partitioning strategy accordingly, to obtain both high performance and wide applicability for workload partitioning on heterogeneous platforms. To summarize, this thesis demonstrates that heterogeneous platforms are the right solution, performance-wise, for many classes of data parallel applications, and shows how high performance can be achieved systematically.
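To make the idea of model-based workload partitioning concrete, the following is a minimal sketch and not the method developed in the thesis. It assumes a simple linear cost model in which each device processes items at a constant throughput and the GPU additionally pays a per-item transfer cost. The function names (`gpu_fraction`, `split_workload`) and all parameters are hypothetical and chosen only for illustration. The split is chosen so that both devices are predicted to finish at the same time.

```python
def gpu_fraction(cpu_throughput, gpu_throughput, transfer_bandwidth=None):
    """Return the fraction of items to assign to the GPU.

    Assumed (illustrative) model for n items and GPU fraction beta:
        cpu_time = (1 - beta) * n / cpu_throughput
        gpu_time = beta * n * (1 / gpu_throughput + 1 / transfer_bandwidth)
    Setting cpu_time == gpu_time and solving for beta gives the result below.
    All rates are in items per second; transfer_bandwidth=None means that
    data transfer is treated as free.
    """
    if cpu_throughput <= 0 or gpu_throughput <= 0:
        raise ValueError("throughputs must be positive")
    cpu_cost = 1.0 / cpu_throughput          # seconds per item on the CPU
    gpu_cost = 1.0 / gpu_throughput          # seconds per item on the GPU
    if transfer_bandwidth is not None:
        if transfer_bandwidth <= 0:
            raise ValueError("transfer_bandwidth must be positive")
        gpu_cost += 1.0 / transfer_bandwidth  # add per-item transfer time
    return cpu_cost / (cpu_cost + gpu_cost)


def split_workload(n_items, beta):
    """Split n_items into (gpu_items, cpu_items) for a GPU fraction beta."""
    gpu_items = int(round(n_items * beta))
    return gpu_items, n_items - gpu_items


if __name__ == "__main__":
    # Hypothetical rates: the GPU is 3x faster than the CPU.
    beta = gpu_fraction(cpu_throughput=100.0, gpu_throughput=300.0)
    print(split_workload(1000, beta))
```

Under this model, a slower interconnect shifts work back to the CPU: passing a finite `transfer_bandwidth` lowers the returned fraction. Real applications, in particular imbalanced ones where the cost per item varies, would need a richer model than this constant-throughput assumption.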