🤖 AI Summary
This study addresses the escalating energy consumption of deep learning models driven by their increasing scale and complexity, highlighting an urgent need for sustainable optimization strategies. It presents a systematic investigation into how hyperparameters, such as training epochs and learning rate, affect energy usage. By designing hyperparameter mutation operators and combining real-time energy monitoring with performance metrics, the authors conduct experiments across both single-model and parallel training scenarios. Empirical results on five real-world models show that most of the studied hyperparameters are significantly correlated with energy consumption. Crucially, appropriate hyperparameter tuning can substantially reduce energy use without compromising model performance, and parallel training environments are especially sensitive to hyperparameter-induced energy variations. This work thus offers a novel perspective and an empirical foundation for advancing green deep learning practices.
📝 Abstract
Context: As deep learning (DL) models develop, larger datasets and more complex model structures are used, driving up computing resources and energy consumption; this is a signal that green DL models should receive more attention. Objective: This paper takes a novel view of DL energy consumption: the effect of hyperparameters on the energy cost of DL models. Method: Our approach uses mutation operators to simulate how practitioners adjust hyperparameters, such as epochs and learning rates. We train the original and mutated models separately and gather energy information and run-time performance metrics. Moreover, we consider the parallel scenario in which multiple DL models are trained simultaneously. Results: To examine the effect of hyperparameters on energy consumption, we conducted extensive experiments on five real-world DL models. The results show that (1) many of the studied hyperparameters have a (positive or negative) correlation with energy consumption, (2) adjusting hyperparameters can make DL models greener, i.e., reduce energy consumption without degrading performance, and (3) in a parallel environment, energy consumption becomes more susceptible to change. Conclusions: We suggest that hyperparameters deserve more attention when developing DL models, as appropriately adjusting them can yield green DL models.
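The method described above can be sketched in a few lines. The following is a minimal, hypothetical illustration of the idea, not the authors' actual code: `mutate_hyperparams` stands in for the paper's mutation operators, and `estimated_energy` is a toy proxy for the real-time energy monitoring the study performs (in practice, measurements would come from a power meter or interfaces such as Intel RAPL or NVIDIA NVML).

```python
# Hypothetical sketch of the study's workflow: mutate hyperparameters,
# then compare the energy cost of the original vs. mutated configuration.
# All function and variable names here are illustrative assumptions.
import random

BASE_CONFIG = {"epochs": 10, "learning_rate": 0.01, "batch_size": 32}

def mutate_hyperparams(config, seed=0):
    """Apply simple mutation operators: scale the learning rate and
    perturb the epoch count, mimicking practitioner adjustments."""
    rng = random.Random(seed)
    mutated = dict(config)
    mutated["learning_rate"] = config["learning_rate"] * rng.choice([0.5, 2.0])
    mutated["epochs"] = max(1, config["epochs"] + rng.choice([-5, 5]))
    return mutated

def estimated_energy(config, watts=250.0, seconds_per_epoch=60.0):
    """Toy proxy: energy in joules grows linearly with training epochs.
    A real study would measure power draw during actual training runs."""
    return watts * seconds_per_epoch * config["epochs"]

if __name__ == "__main__":
    mutant = mutate_hyperparams(BASE_CONFIG, seed=1)
    print("original:", estimated_energy(BASE_CONFIG), "J")
    print("mutated: ", estimated_energy(mutant), "J")
```

Comparing the two energy figures alongside accuracy metrics for each run is the essence of the paper's single-model experiments; the parallel scenario repeats this while several such training jobs share the same hardware.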