Estimating Deep Learning energy consumption based on model architecture and training environment

Journal Article (2027)

Author(s)

Santiago del Rey (Universitat Politècnica de Catalunya)

Luís Cruz (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Xavier Franch (Universitat Politècnica de Catalunya)

Silverio Martínez-Fernández (Universitat Politècnica de Catalunya)

Research Group

Software Engineering

Keywords

Neural networks; Empirical software engineering; Energy consumption estimation; Green in AI; Sustainable computing

DOI related publication

https://doi.org/10.1016/j.csi.2026.104170 Final published version

To reference this document use

https://resolver.tudelft.nl/uuid:e3ec96f6-3675-4184-9c83-eb5c988f5b39

Publication Year

2027

Language

English

Journal title

Computer Standards and Interfaces

Volume number

99

Article number

104170

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

To raise awareness of the environmental impact of deep learning (DL), numerous studies have estimated the energy consumption of DL systems. However, energy estimates for DL training often rely on unverified assumptions. This work addresses that gap by investigating how model architecture and training environment affect energy consumption. We train a variety of computer vision models and collect energy consumption and accuracy metrics to analyze their trade-offs across configurations. Our results show that selecting the right model–training environment combination can reduce training energy consumption by up to 80.68% with less than 2% loss in F1 score. We find a significant interaction effect between model and training environment: energy efficiency improves when GPU computational power scales with model complexity. Moreover, we demonstrate that common estimation practices, such as using FLOPs or GPU TDP, fail to capture these dynamics and can lead to substantial errors. To address these shortcomings, we propose the Stable Training Epoch Projection (STEP) and the Pre-training Regression-based Estimation (PRE) methods. Our evaluation shows that STEP and PRE reduce Root Mean Squared Error (RMSE) by up to 97% and 84%, respectively, compared with existing estimation tools.
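
As a rough illustration of two quantities the abstract refers to, the Python sketch below shows a naive TDP-based energy estimate (assuming the GPU draws its full rated power for the whole run) and how RMSE, and a relative RMSE reduction, can be computed against measured values. It is not the STEP or PRE method, whose details are not given on this page. The function names and all numbers in the usage section are hypothetical and chosen only for illustration; they are not results from the article.

```python
import math


def tdp_energy_kwh(tdp_watts: float, hours: float) -> float:
    """Naive estimate: assume the GPU draws its full TDP for the entire run."""
    return tdp_watts * hours / 1000.0


def rmse(estimates: list[float], measurements: list[float]) -> float:
    """Root Mean Squared Error between paired estimates and measurements."""
    if not estimates or len(estimates) != len(measurements):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - m) ** 2 for e, m in zip(estimates, measurements)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_reduction_pct(baseline_rmse: float, new_rmse: float) -> float:
    """Relative RMSE reduction of a new estimator versus a baseline, in percent."""
    if baseline_rmse <= 0:
        raise ValueError("baseline RMSE must be positive")
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse


if __name__ == "__main__":
    # Hypothetical runs: (GPU TDP in watts, training time in hours).
    runs = [(300.0, 2.0), (300.0, 5.0), (450.0, 1.5)]
    # Hypothetical measured energy for the same runs, in kWh.
    measured_kwh = [0.35, 1.10, 0.40]
    # Hypothetical output of some other estimator, in kWh.
    other_estimates_kwh = [0.38, 1.05, 0.43]

    tdp_estimates_kwh = [tdp_energy_kwh(tdp, hours) for tdp, hours in runs]
    tdp_rmse = rmse(tdp_estimates_kwh, measured_kwh)
    other_rmse = rmse(other_estimates_kwh, measured_kwh)

    print(f"TDP-based RMSE:  {tdp_rmse:.3f} kWh")
    print(f"Other RMSE:      {other_rmse:.3f} kWh")
    print(f"RMSE reduction:  {rmse_reduction_pct(tdp_rmse, other_rmse):.1f}%")
```

A TDP-based estimate of this kind treats power draw as constant, which is one reason such estimates can diverge from measurements when actual GPU utilization varies with the model being trained.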
