ArchPower: Dataset for Architecture-Level Power Modeling of Modern CPU Design

📅 2025-12-07

🤖 AI Summary

Architecture-level CPU power modeling faces two bottlenecks: (1) there is no publicly available, realistic, fine-grained dataset, and (2) even private in-house datasets often fail to reflect realistic CPU design flows. To address these challenges, the authors introduce ArchPower, the first open-source dataset for architecture-level power modeling of modern CPUs. It comprises 200 samples, collected from 25 CPU configurations each executing 8 workloads. Each sample includes more than 100 architectural features, covering both hardware and event parameters, and ground-truth simulated power labels: the total design power plus the power of each of 11 components, with every value decomposed into four groups (combinational logic, sequential logic, memory, and clock). The data is generated by running complex, realistic CPU design and power simulation flows. ArchPower is publicly released on GitHub and is intended to support ML-based power prediction at early design stages.

📝 Abstract

Power is the primary design objective of large-scale integrated circuits (ICs), especially for complex modern processors (i.e., CPUs). Accurate CPU power evaluation requires designers to go through the whole time-consuming IC implementation process, which can easily take months. At the early design stage (e.g., the architecture level), classical power models are notoriously inaccurate. Recently, ML-based architecture-level power models have been proposed to boost accuracy, but data availability is a severe challenge. Currently, there is no open-source dataset for this important ML application. A typical dataset generation process involves correct CPU design implementation and repetitive execution of power simulation flows, requiring significant design expertise, engineering effort, and execution time. Even private in-house datasets often fail to reflect realistic CPU design scenarios. In this work, we propose ArchPower, the first open-source dataset for architecture-level processor power modeling. We go through complex and realistic design flows to collect the CPU architectural information as features and the ground-truth simulated power as labels. Our dataset includes 200 CPU data samples, collected from 25 different CPU configurations when executing 8 different workloads. There are more than 100 architectural features in each data sample, including both hardware and event parameters. The label of each sample provides fine-grained power information, including the total design power and the power for each of the 11 components. Each power value is further decomposed into four fine-grained power groups: combinational logic power, sequential logic power, memory power, and clock power. ArchPower is available at https://github.com/hkust-zhiyao/ArchPower.
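
The dataset dimensions described in the abstract can be sketched in a few lines of Python. This is only an illustrative layout, not the actual file format of the ArchPower repository: the `Sample` class, the field names, the `component_i` scope names, and the helper functions are all hypothetical. Only the counts (25 configurations, 8 workloads, 11 components, four power groups) come from the abstract.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Tuple

# Counts taken from the abstract.
NUM_CONFIGS = 25
NUM_WORKLOADS = 8
NUM_COMPONENTS = 11
POWER_GROUPS = ("combinational", "sequential", "memory", "clock")


@dataclass
class Sample:
    """Hypothetical in-memory view of one data sample (not the repo's format)."""
    config_id: int
    workload_id: int
    features: Dict[str, float]           # >100 hardware and event parameters
    power: Dict[str, Dict[str, float]]   # scope -> power group -> value


def label_scopes() -> List[str]:
    # One scope for the total design power plus one per component.
    # The "component_i" names are placeholders, not real component names.
    return ["total"] + [f"component_{i}" for i in range(NUM_COMPONENTS)]


def sample_keys() -> List[Tuple[int, int]]:
    # Every configuration is paired with every workload.
    return list(product(range(NUM_CONFIGS), range(NUM_WORKLOADS)))


def flatten_label(sample: Sample) -> List[float]:
    # Flatten the label into a fixed-order vector for a regression model.
    return [sample.power[s][g] for s in label_scopes() for g in POWER_GROUPS]


print(len(sample_keys()))                        # 25 * 8 = 200 samples
print(len(label_scopes()) * len(POWER_GROUPS))   # (1 + 11) * 4 = 48 label values
```

Under this reading, each sample carries 48 power values as its label: the total and each of the 11 components, each split into the four power groups.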

Problem

Research questions and friction points this paper is trying to address.

Lack of open-source dataset for ML-based CPU power modeling

Inaccurate classical power models at early CPU design stages

Time-consuming and expertise-intensive process for power dataset generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Open-source dataset for CPU power modeling

Includes 200 samples from 25 CPU configurations executing 8 workloads

Provides fine-grained power breakdown across 11 components and four power groups
