🤖 AI Summary
Current microprocessor evaluation relies heavily on slow, cycle-accurate simulators driven by unrepresentative benchmark traces. This paper introduces Neutrino, a deep learning–based framework for high-fidelity "in-the-wild" simulation, enabling cycle-level performance prediction of hypothetical microarchitectures on commodity hardware. Methodologically, Neutrino (1) models workloads with microarchitecture-independent features, so a model trained on today's silicon transfers across design generations; (2) pairs a lightweight hardware trace collector with a systematic sampling strategy, enabling low-overhead A/B testing in production environments (just 0.1% performance overhead) and scalable deployment; and (3) co-designs an on-chip accelerator: the baseline reaches 5 MIPS of simulation throughput on a commodity GPU, and the accelerator delivers a further 85× speedup. Together, these contributions significantly accelerate processor design iteration and enable efficient, scalable hardware evaluation under realistic workloads.
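To make the sampling idea concrete, here is a minimal sketch (not Neutrino's actual implementation — the function, window length, and rate are invented for illustration) of systematically sampling fixed-length instruction windows from a long execution so that only a small, evenly spaced fraction of instructions is ever traced, which is how a tracing budget on the order of 0.1% can be enforced:

```python
# Illustrative sketch: systematic sampling of fixed-length instruction
# windows from an execution trace. Only the sampled windows are collected
# and fed to the simulator, bounding tracing overhead by `sampling_rate`.

def sample_windows(total_instructions, window_len, sampling_rate):
    """Return (start, end) instruction ranges chosen systematically.

    sampling_rate is the fraction of instructions to capture,
    e.g. 0.001 for a ~0.1% tracing budget.
    """
    period = int(window_len / sampling_rate)  # spacing between window starts
    return [(start, min(start + window_len, total_instructions))
            for start in range(0, total_instructions, period)]

# 100M-instruction run, 10k-instruction windows, 0.1% budget:
windows = sample_windows(total_instructions=100_000_000,
                         window_len=10_000,
                         sampling_rate=0.001)
captured = sum(end - start for start, end in windows)  # 100_000 instructions
```

Systematic (evenly spaced) sampling is one simple choice; the paper's "principled sampling strategy" may weight or stratify windows differently.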
📝 Abstract
The evaluation of new microprocessor designs is constrained by slow, cycle-accurate simulators that rely on unrepresentative benchmark traces. This paper introduces a deep learning framework for high-fidelity, "in-the-wild" simulation on production hardware. Our core contribution is a DL model trained on microarchitecture-independent features to predict the cycle-level performance of hypothetical processor designs. Because these features do not depend on the machine they were collected on, the model can be deployed on existing silicon to evaluate future hardware. We propose a complete system featuring a lightweight hardware trace collector and a principled sampling strategy that minimizes user impact. This system achieves a simulation speed of 5 MIPS on a commodity GPU while imposing a mere 0.1% performance overhead. Furthermore, our co-designed Neutrino on-chip accelerator improves performance by a further 85× over the GPU. We demonstrate that this framework enables accurate performance analysis and large-scale hardware A/B testing using real-world applications.
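The core idea — predicting performance of a *hypothetical* design from features measured on *existing* silicon — can be sketched as a model whose input concatenates microarchitecture-independent workload features with parameters of the candidate design. The following toy example uses a linear model as a stand-in for the paper's DL model; every feature name, weight, and value below is invented purely for illustration:

```python
# Hypothetical sketch of feature-based performance prediction.
# A learned function maps (workload features measured on current hardware,
# parameters of a hypothetical design) -> predicted cycles per instruction.
# The linear form and all names/values here are illustrative assumptions.

def predict_cpi(workload_features, design_params, weights, bias):
    # Concatenate workload and design descriptors into one input vector,
    # then apply a learned linear map (the paper uses a DL model instead).
    x = {**workload_features, **design_params}
    return bias + sum(weights[name] * value for name, value in x.items())

workload = {"branch_mispredict_rate": 0.02,  # uarch-independent statistics
            "mem_reuse_distance": 0.4,
            "ilp_limit": 0.3}
design = {"rob_entries_norm": 0.5,           # knobs of the candidate design
          "l2_size_norm": 0.25}
weights = {"branch_mispredict_rate": 8.0, "mem_reuse_distance": 1.5,
           "ilp_limit": -0.9, "rob_entries_norm": -0.4, "l2_size_norm": -0.3}

cpi = predict_cpi(workload, design, weights, bias=1.0)
```

Because `workload` is collected once on deployed silicon, the same features can be re-scored against many different `design` vectors, which is what makes large-scale hardware A/B testing on production machines feasible.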