🤖 AI Summary
To address the low utilization of deep idle states in servers running latency-sensitive applications, this paper identifies a significant gap between theoretically available idle opportunities and their actual exploitation, caused by inaccurate idle-scheduling decisions and non-negligible deep-sleep transition latency. We propose a queueing-theoretic modeling framework that integrates M/M/1, c×M/M/1, and M/M/c models, calibrated with real-world server workload traces, to quantify system-level idle potential under diverse configurations. For the first time, we systematically identify numerous untriggered deep-idle entry opportunities and develop a scalable methodology for idle-efficiency evaluation. The framework provides quantifiable, early-stage guidance for hardware–OS co-design, enabling energy-efficiency optimization and supporting system-level power-management strategies that explicitly balance latency constraints against energy savings.
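To make the transition-latency point concrete, here is a minimal sketch (not from the paper): under an M/M/1 model, a core's idle periods are exponentially distributed with the arrival rate, so the share of idle intervals long enough to amortize a deep-idle transition has a closed form. The arrival rate and the 200 µs target residency below are illustrative values, not figures from the paper.

```python
# Minimal sketch: fraction of M/M/1 idle periods that are long enough
# to justify a deep-idle transition. Parameters are illustrative.
import math


def exploitable_idle_fraction(lam: float, target_residency: float) -> float:
    """In M/M/1, server idle periods are Exp(lam)-distributed, so the
    fraction longer than a C-state's target residency t is
    P(T > t) = exp(-lam * t)."""
    return math.exp(-lam * target_residency)


if __name__ == "__main__":
    lam = 1000.0        # illustrative per-core arrival rate: 1000 req/s
    residency = 200e-6  # hypothetical deep-idle target residency: 200 us
    frac = exploitable_idle_fraction(lam, residency)
    print(f"idle periods long enough for deep idle: {frac:.1%}")
```

Even at moderate load, a large share of idle intervals can fall below the residency threshold, which is one way the exploitation gap described above can arise.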
📝 Abstract
This work introduces a model-based framework that reveals the idle opportunities of modern servers running latency-critical applications. Specifically, three queueing models, M/M/1, c×M/M/1, and M/M/c, are used to estimate the theoretical idle-time distribution at the CPU-core and system (package) level. Comparing the actual idleness of a real server against that predicted by the theoretical models reveals significant missed opportunities to enter deep idle states. This inefficiency is attributed to idle-governor inaccuracy and the high latency of transitioning to and from legacy deep-idle states. The proposed methodology enables early-stage design exploration and offers insight into idle-time behavior and opportunities across varying server configurations and loads.
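As a rough illustration of how such models quantify idle potential, the sketch below computes theoretical idle fractions at the core level and the package level (all cores simultaneously idle) for the three models named above. It assumes Poisson arrivals and exponential service, as the model names imply; the arrival rate, service rate, and core count are made-up example parameters, not the paper's calibrated traces.

```python
# Sketch of the three queueing models named in the abstract, used here
# to estimate theoretical idle fractions at the core level and at the
# package level (probability all c cores are idle at once).
import math


def mm1_idle(lam: float, mu: float) -> float:
    """M/M/1: fraction of time a single server (core) is idle."""
    rho = lam / mu
    assert rho < 1, "system must be stable (rho < 1)"
    return 1.0 - rho


def c_mm1_package_idle(lam: float, mu: float, c: int) -> float:
    """c x M/M/1: c independent per-core queues, each receiving lam/c.
    Package idle = probability that all c cores are idle simultaneously
    (queues are independent, so the per-core probabilities multiply)."""
    return mm1_idle(lam / c, mu) ** c


def mmc_package_idle(lam: float, mu: float, c: int) -> float:
    """M/M/c: one shared queue feeding c cores. The empty-system
    probability P0 is the probability that every core is idle at once."""
    rho = lam / (c * mu)
    assert rho < 1, "system must be stable (rho < 1)"
    a = lam / mu  # offered load in Erlangs
    p0_inv = sum(a**n / math.factorial(n) for n in range(c))
    p0_inv += a**c / (math.factorial(c) * (1.0 - rho))
    return 1.0 / p0_inv


if __name__ == "__main__":
    lam, mu, c = 8.0, 2.0, 8  # illustrative: 50% average core utilization
    print(f"per-core idle (M/M/1, lam/c each): {mm1_idle(lam / c, mu):.3f}")
    print(f"package idle (c x M/M/1):          {c_mm1_package_idle(lam, mu, c):.3f}")
    print(f"package idle (M/M/c):              {mmc_package_idle(lam, mu, c):.3f}")
```

In this toy setting the shared-queue M/M/c model yields a noticeably higher probability of a fully idle package than c independent per-core queues, which illustrates why the choice of model matters for package-level idle estimates.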