Performance engineering and energy efficiency of building blocks for large, sparse eigenvalue computations on heterogeneous supercomputers

Conference Paper (2016)

Author(s)

Moritz Kreutzer (Friedrich-Alexander-Universität Erlangen-Nürnberg)

Jonas Thies (Deutsches Zentrum für Luft- und Raumfahrt (DLR))

Andreas Pieper (Greifswald University)

Andreas Alvermann (Greifswald University)

Martin Galgon (Bergische Universität Wuppertal)

Melven Röhrig-Zöllner (Deutsches Zentrum für Luft- und Raumfahrt (DLR))

Faisal Shahzad (Friedrich-Alexander-Universität Erlangen-Nürnberg)

Achim Basermann (Deutsches Zentrum für Luft- und Raumfahrt (DLR))

Alan R. Bishop (Los Alamos National Laboratory)

Holger Fehske (Greifswald University)

Georg Hager (Friedrich-Alexander-Universität Erlangen-Nürnberg)

Bruno Lang (Bergische Universität Wuppertal)

Gerhard Wellein (Friedrich-Alexander-Universität Erlangen-Nürnberg)

Affiliation

External organisation

DOI related publication

https://doi.org/10.1007/978-3-319-40528-5_14 Final published version

To reference this document use

https://resolver.tudelft.nl/uuid:2e753f45-0a4d-4986-8e17-60798baa9a1a

Publication Year

2016

Language

English

Pages (from-to)

317-338

Publisher

Springer

ISBN (print)

9783319405261

Event

International Conference on Software for Exascale Computing, SPPEXA 2015 (2016-01-25 - 2016-01-27), Munich, Germany

Abstract

Numerous challenges have to be mastered as applications in scientific computing are being developed for post-petascale parallel systems. While ample parallelism is usually available in the numerical problems at hand, the efficient use of supercomputer resources requires not only good scalability but also a verifiably effective use of resources on the core, the processor, and the accelerator level. Furthermore, power dissipation and energy consumption are becoming further optimization targets besides time-to-solution. Performance Engineering (PE) is the pivotal strategy for developing effective parallel code on all levels of modern architectures. In this paper we report on the development and use of low-level parallel building blocks in the GHOST library (“General, Hybrid, and Optimized Sparse Toolkit”). We demonstrate the use of PE in optimizing a density of states computation using the Kernel Polynomial Method, and show that reduction of runtime and reduction of energy are literally the same goal in this case. We also give a brief overview of the capabilities of GHOST and the applications in which it is being used successfully.