🤖 AI Summary
Processor thermal design power (TDP) is widely misused as a proxy for actual power consumption in physics simulations, leading to inaccurate energy-efficiency assessments. Method: This study conducts the first empirical power and energy measurements of major production-scale physics simulation codes on heterogeneous exascale supercomputers at LLNL and Sandia. Leveraging multi-granularity energy modeling, cross-platform benchmarking, and real-time monitoring across commodity and advanced CPU–GPU heterogeneous nodes, it systematically quantifies runtime energy efficiency. Contribution/Results: Under typical simulation workloads, measured power draw is only 30–60% of TDP, substantially lower than nominal ratings. This work challenges the longstanding practice of substituting TDP for measured power, establishing an empirically grounded methodology for evaluating energy efficiency in exascale systems. It provides critical, reproducible, and generalizable energy benchmarks to guide hardware deployment and energy-aware optimization, thereby advancing low-carbon scientific computing.
📝 Abstract
Power is an often-cited reason for moving to advanced architectures on the path to Exascale computing. This is due to the practical concern of delivering enough power to successfully site and operate these machines, as well as concerns over energy usage while running large simulations. Since accurate power measurements can be difficult to obtain, processor thermal design power (TDP) is a possible surrogate due to its simplicity and availability. However, TDP is not indicative of typical power usage while running simulations. Using commodity and advanced technology systems at Lawrence Livermore National Laboratory (LLNL) and Sandia National Laboratories, we performed a series of experiments to measure power and energy usage while running simulation codes. These experiments indicate that large-scale LLNL simulation codes are significantly more efficient than a simple processor TDP model might suggest.
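The contrast between a TDP-based estimate and a measured one can be illustrated with a minimal Python sketch. It is not the measurement pipeline used in the study. It assumes power samples (in watts) with timestamps (in seconds) have already been collected by some monitoring tool, and the function names `energy_joules`, `tdp_energy_joules`, and `fraction_of_tdp` are hypothetical, chosen only for this illustration.

```python
def energy_joules(timestamps_s, power_w):
    """Integrate sampled power (W) over time (s) with the trapezoidal rule."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have equal length")
    if len(timestamps_s) < 2:
        raise ValueError("need at least two samples to integrate")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        # Average of adjacent samples times the interval between them.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def tdp_energy_joules(tdp_w, duration_s):
    """Energy estimate that assumes the processor draws TDP for the whole run."""
    return tdp_w * duration_s


def fraction_of_tdp(timestamps_s, power_w, tdp_w):
    """Ratio of measured energy to the TDP-based estimate over the same run."""
    duration_s = timestamps_s[-1] - timestamps_s[0]
    return energy_joules(timestamps_s, power_w) / tdp_energy_joules(tdp_w, duration_s)
```

A ratio well below 1.0 from `fraction_of_tdp` would indicate that the TDP model overstates the energy used by that run. Any sample values passed in are the caller's own data; none are taken from the paper.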