AgileWatts: An Energy-Efficient CPU Core Idle-State Architecture for Latency-Sensitive Server Applications

📅 2022-03-04
🏛️ Micro
📈 Citations: 14
Influential: 1
🤖 AI Summary
Latency-sensitive datacenter services (e.g., Memcached, MySQL) operate under tight latency requirements (30–250 µs), while leaving a deep CPU core idle state (C-state) takes on the order of 100 µs. This mismatch makes existing deep idle states ineffective for saving energy. This paper proposes AgileWatts (AW), a deep-idle architecture with two new C-states, C6A and C6AE, that cuts transition latency while retaining most of the power savings of deep idle. It rests on three ideas: (1) medium-grained power gates, with the core context retained in the power-ungated domain; (2) L1/L2 caches that stay power-ungated, using cache sleep mode and a minimum operational core voltage to limit leakage; and (3) an all-digital PLL (ADPLL) that stays active and locked during idle. In the evaluation, C6A and C6AE reduce transition time by up to 900× relative to C6 while consuming only 7% and 5% of active-state (C0) power, respectively. For Memcached, energy consumption drops by up to 71% (35% on average) with less than 1% end-to-end performance degradation; MySQL and Kafka show similar trends.
📝 Abstract
User-facing applications running in modern datacenters exhibit irregular request patterns and are implemented using a multitude of services with tight latency requirements (30–250 $\mu$s). These characteristics render existing energy-conserving techniques ineffective when processors are idle due to the long transition time (order of 100 $\mu$s) from a deep CPU core idle power state (C-state). While prior works propose management techniques to mitigate this inefficiency, we tackle it at its root with AgileWatts (AW): a new deep CPU core C-state architecture optimized for datacenter server processors targeting latency-sensitive applications. AW drastically reduces the transition latency from deep CPU core idle power states while retaining most of their power savings based on three key ideas. First, AW eliminates the latency (several microseconds) of saving/restoring the core context when powering-off/-on the core in a deep idle state by i) implementing medium-grained power-gates, carefully distributed across the CPU core, and ii) retaining context in the power-ungated domain. Second, AW eliminates the flush latency (several tens of microseconds) of the L1/L2 caches when entering a deep idle state by keeping L1/L2 content power-ungated. A small control logic also remains ungated to serve cache coherence traffic. AW implements cache sleep-mode and leakage reduction for the power-ungated domain by lowering a core's voltage to the minimum operational level. Third, using a state-of-the-art power-efficient all-digital phase-locked loop (ADPLL) clock generator, AW keeps the PLL active and locked during the idle state, cutting microseconds of wake-up latency at negligible power cost. Our evaluation with an accurate industrial-grade simulator calibrated against an Intel Skylake server shows that AW reduces the energy consumption of Memcached by up to 71% (35% on average) with <1% end-to-end performance degradation. We observe similar trends for other evaluated services (MySQL and Kafka). AW's new deep C-states C6A and C6AE reduce transition time by up to 900$\times$ as compared to the deepest existing idle state C6, while consuming only 7% and 5% of the active state (C0) power, respectively.
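To illustrate why transition latency matters for short idle gaps, here is a minimal sketch of a toy idle-energy model. It is not the paper's simulator. The C6 transition time (order of 100 µs), the "up to 900×" reduction, and the 7% power figure for C6A come from the abstract. The C1 entry, the C6 power fraction, and the assumption that a core burns active power for the whole transition are hypothetical placeholders chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CState:
    name: str
    transition_us: float  # time spent entering and leaving the state, in microseconds
    power_frac: float     # idle power as a fraction of active (C0) power


# C6 latency and C6A power follow the abstract; everything marked ASSUMED is a placeholder.
C1 = CState("C1", transition_us=2.0, power_frac=0.50)       # ASSUMED values
C6 = CState("C6", transition_us=100.0, power_frac=0.02)     # power_frac is ASSUMED
C6A = CState("C6A", transition_us=100.0 / 900.0, power_frac=0.07)  # best-case 900x bound


def idle_energy(idle_us: float, state: CState) -> float:
    """Energy over one idle gap, in units of (C0 power x microseconds).

    ASSUMPTION: the core draws full active power while transitioning, and a
    state whose transition does not fit in the gap cannot be used at all.
    """
    if state.transition_us >= idle_us:
        return idle_us  # stay in C0 for the whole gap
    return state.transition_us + (idle_us - state.transition_us) * state.power_frac


def best_state(idle_us: float, states: list[CState]) -> CState:
    """Pick the state with the lowest modeled energy for this idle gap."""
    return min(states, key=lambda s: idle_energy(idle_us, s))


if __name__ == "__main__":
    for gap in (50.0, 10_000.0):
        chosen = best_state(gap, [C1, C6, C6A])
        print(f"idle gap {gap:>8.0f} us -> {chosen.name}")
```

Under these placeholder numbers, a 50 µs gap is too short for C6 to be usable, while C6A still fits; for very long gaps the model favors whichever state has the lowest idle power.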
Problem

Research questions and friction points this paper is trying to address.

Optimizes deep C-state for latency-sensitive applications
Reduces L1/L2 cache flush latency overhead
Minimizes wake-up latency with active PLL during idle
Innovation

Methods, ideas, or system contributions that make the work stand out.

Implements medium-grained power-gates
Retains L1/L2 cache content
Uses power-efficient ADPLL clock generator
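The mapping between the three ideas and the transition costs they remove can be sketched as a small lookup. This is only an illustration: the abstract gives rough magnitudes ("several microseconds", "several tens of microseconds", "microseconds"), so the exact values below are assumed placeholders, and the identifier names are hypothetical.

```python
# Rough transition costs in microseconds. The abstract states only orders of
# magnitude; these exact numbers are ASSUMED placeholders, not measured data.
TRANSITION_COSTS_US = {
    "context_save_restore": 5.0,   # "several microseconds"
    "l1_l2_cache_flush": 50.0,     # "several tens of microseconds"
    "pll_relock": 5.0,             # "microseconds"
}

# Which cost each AW idea removes, as described in the abstract.
REMOVED_BY = {
    "medium_grained_power_gates": "context_save_restore",
    "ungated_l1_l2_retention": "l1_l2_cache_flush",
    "always_locked_adpll": "pll_relock",
}


def remaining_latency_us(enabled_ideas: list[str]) -> float:
    """Sum the modeled transition costs not removed by the enabled ideas."""
    removed = {REMOVED_BY[idea] for idea in enabled_ideas}
    return sum(cost for name, cost in TRANSITION_COSTS_US.items() if name not in removed)


if __name__ == "__main__":
    print(remaining_latency_us([]))                           # all costs present
    print(remaining_latency_us(["ungated_l1_l2_retention"]))  # flush cost removed
    print(remaining_latency_us(list(REMOVED_BY)))             # all three ideas enabled
```

The sketch covers only the three components named in the abstract and does not attempt to reproduce the paper's full latency breakdown.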
J. Yahya
Huawei Technologies
Haris Volos
University of Cyprus
Operating Systems, Computer Architecture, Persistent Memory
D. Bartolini
Huawei Technologies
Georgia Antoniou
University of Cyprus
Jeremie S. Kim
Carnegie Mellon University
Computer Architecture, Bioinformatics
Zhe Wang
Huawei Technologies
Kleovoulos Kalaitzidis
Huawei Technologies
Tom Rollet
Huawei Technologies
Zhi-Rui Chen
Huawei Technologies
Ye Geng
Huawei Technologies
O. Mutlu
ETH Zurich
Yiannakis Sazeides
University of Cyprus