🤖 AI Summary
Existing methodologies—such as SPEC CPU2017, Design of Experiments (DoE), and Randomized Controlled Trials (RCTs)—struggle to accurately attribute overall system performance to individual hardware components due to their inability to effectively isolate component-level contributions, resulting in substantial evaluation variability (SPEC score deviations ranging from 12.16% to 436.80%). This work proposes a novel methodology that integrates controlled experimentation with a theoretical attribution model, enabling, for the first time, precise and stable attribution of system performance to specific hardware components. The proposed approach significantly outperforms conventional techniques, offering high cost-effectiveness while overcoming inherent limitations in component evaluation and system design present in current practices.
📝 Abstract
In a computer system, multiple indispensable components, such as the CPU and memory, work together to produce an overall effect, which can only be measured on an independently running system. Since the system operates as an integrated whole, isolating the effect of individual components is challenging. Accurately attributing the system's overall effect to a specific component is crucial for both computer design and evaluation.
Taking CPU evaluation as a case study, our experiments reveal that general-purpose rigorous methodologies, such as DoE and RCTs, cannot address this issue efficiently, while SPEC CPU2017, a single-purpose empirical methodology and the industry-standard CPU benchmark, reports only the overall effect. More concerning still, for an identical CPU, the undefined configurations of the other indispensable components introduce uncontrolled variability, with SPEC scores fluctuating from 12.16% to 436.80%.
We propose a rigorous methodology that attributes the overall effect to a specific component; it can be used in computer component evaluation and design, as well as in other areas. Through theoretical analysis and pioneering controlled experiments, we systematically compare our methodology against three established methodologies: SPEC CPU2017, DoE, and RCTs. The results show that our methodology achieves its goal cost-efficiently, while the others exhibit inherent limitations.