Energy Optimization for Large-Scale 3D Manycores in the Dark-Silicon Era

Journal Article (2019)
Author(s)

Sohaib Majzoub (University of Sharjah, Sharjah)

Resve A. Saleh (University of British Columbia)

Imran Ashraf (TU Delft - Computer Engineering)

Mottaqiallah Taouil (TU Delft - Computer Engineering)

S. Hamdioui (TU Delft - Computer Engineering)

Research Group
Computer Engineering
Copyright
© 2019 Sohaib Majzoub, Resve A. Saleh, I. Ashraf, M. Taouil, S. Hamdioui
DOI related publication
https://doi.org/10.1109/ACCESS.2019.2900477
Publication Year
2019
Language
English
Volume number
7
Pages (from-to)
33115-33129
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In this paper, we study the impact of the idle/dynamic power consumption ratio on the effectiveness of a multi-Vdd/frequency manycore design. We propose a new tool called LVSiM (a Low-Power and Variation-Aware Manycore Simulator) to carry out the experiments. It is a novel manycore simulator targeted towards low-power optimization methods, including within-die process and workload variations. LVSiM provides a holistic platform for multi-Vdd/frequency voltage island analysis, optimization, and design. It provides a tool for the early design exploration stage to analyze large-scale manycores with a given number of cores on 3D-stacked layers, network-on-chip communication buses, technology parameters, voltage and frequency values, and power grid parameters, using a variety of optimization methods. LVSiM has been calibrated with Sniper/McPAT at a nominal frequency, and then the energy-delay-product (EDP) numbers were compared after frequency scaling. The average error is shown to be 10% after frequency scaling, which is sufficient for our purposes. The experiments in this work are carried out for different idle/dynamic ratios considering 1260 benchmarks with task sizes ranging from 4,000 to 16,000 executing on 3200 cores. The best configurations are shown to produce on average 20.7% to 24.6% EDP savings compared to the nominal configuration. Traditional scheduling methods are used in the nominal configuration with the unused cores switched off. In addition, we show that, as the idle/dynamic ratio increases, the multi-Vdd/frequency approach becomes less effective. In the case of a high idle/dynamic ratio, the minimum EDP can be achieved by switching off unused cores rather than by using a multi-Vdd/frequency approach. This conclusion is important, especially in the dark-silicon era, where switching cores on and/or off as needed is a common practice.
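
The trend described in the abstract can be illustrated with a toy single-core model. This is a minimal sketch and not LVSiM's power model: it assumes that voltage and frequency are scaled together by a factor `scale`, that dynamic power follows the textbook V^2 * f relation (so it scales as `scale ** 3`), and that idle power is a constant fraction of nominal dynamic power that does not change with scaling. The function names, the list of scale factors, and the ratio values are all hypothetical and chosen only for illustration.

```python
def relative_edp(scale, idle_dynamic_ratio):
    """EDP of one task, normalized so that nominal dynamic power and
    nominal delay are both 1.0. Toy model, not the LVSiM model."""
    dynamic_power = scale ** 3        # assumption: P_dyn ~ V^2 * f, V and f scaled together
    idle_power = idle_dynamic_ratio   # assumption: idle power unaffected by scaling
    delay = 1.0 / scale               # lower frequency stretches execution time
    energy = (dynamic_power + idle_power) * delay
    return energy * delay             # energy-delay product


def best_scale(idle_dynamic_ratio, scales):
    """Return the scale factor with the lowest EDP for a given idle/dynamic ratio."""
    return min(scales, key=lambda s: relative_edp(s, idle_dynamic_ratio))


# Hypothetical voltage/frequency scale factors; 1.0 is the nominal operating point.
SCALES = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

for ratio in (0.05, 0.2, 0.5, 1.0):
    s = best_scale(ratio, SCALES)
    saving = 1.0 - relative_edp(s, ratio) / relative_edp(1.0, ratio)
    print(f"idle/dynamic = {ratio:4.2f}  best scale = {s:.1f}  EDP saving = {saving:6.1%}")
```

Under these assumptions, the EDP-optimal scale factor moves toward the nominal point as the idle/dynamic ratio grows, and at high ratios scaling down gives no EDP saving at all. The sketch does not model core switching, process variation, or the network-on-chip, so it only mirrors the qualitative trend and none of the reported numbers.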