🤖 AI Summary
Problem: Manual discovery of speculative-execution vulnerabilities (e.g., Spectre) is costly, requires deep hardware expertise, and lacks generalizability. Method: This work applies reinforcement learning to post-silicon, black-box microprocessor vulnerability discovery, which it presents as the first application in this domain. We propose an end-to-end, source-code-free, hardware-model-free instruction-level exploration framework that integrates Deep Q-Networks (DQN), hardware-aware instruction sequence modeling, and side-channel feedback–driven reward design to autonomously search for vulnerable execution paths and synthesize exploit chains. Contribution/Results: The framework successfully reproduces multiple Spectre variants on real x86 and RISC-V processors, achieving a 3–5× improvement in vulnerability discovery efficiency. It is the first to generate full exploits automatically without any prior knowledge of target microarchitectural details, establishing a scalable, automated paradigm for hardware security validation.
📝 Abstract
Speculative attacks such as Spectre can leak secret information without being detected by the operating system. Speculative execution vulnerabilities are finicky and deep: exploiting them requires intensive manual labor and intimate knowledge of the hardware. In this paper, we introduce SpecRL, a framework that uses reinforcement learning to find speculative execution leaks in post-silicon (black-box) microprocessors.
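
To make the idea of a reward-driven search over instruction sequences concrete, here is a minimal sketch. It is not SpecRL's implementation: it swaps the DQN for tabular Q-learning, and the instruction vocabulary (`ACTIONS`), the "leaky" gadget (`TARGET`), and the reward function (`simulated_leak_reward`) are all hypothetical stand-ins for real machine instructions and a real side-channel measurement taken on hardware. The agent only ever sees the reward, mirroring the black-box setting.

```python
import random
from collections import defaultdict

# Hypothetical instruction vocabulary; a real agent would emit machine instructions.
ACTIONS = ["nop", "load", "branch", "flush", "probe"]
# Hypothetical "leaky" gadget standing in for a real speculative-execution pattern.
TARGET = ("flush", "branch", "load", "probe")
SEQ_LEN = len(TARGET)


def simulated_leak_reward(seq):
    """Stand-in for a side-channel measurement: 1.0 if the toy gadget was built."""
    return 1.0 if tuple(seq) == TARGET else 0.0


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice with random tie-breaking among equally valued actions."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    values = [q[(state, a)] for a in ACTIONS]
    best = max(values)
    return rng.choice([a for a, v in zip(ACTIONS, values) if v == best])


def train(episodes=20000, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning; the state is the instruction sequence emitted so far."""
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (state, action) -> estimated value
    for _ in range(episodes):
        state = ()
        for step in range(SEQ_LEN):
            action = choose_action(q, state, epsilon, rng)
            next_state = state + (action,)
            done = step == SEQ_LEN - 1
            # Sparse reward: the simulated leak signal is observed only at the end.
            reward = simulated_leak_reward(next_state) if done else 0.0
            future = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = next_state
    return q


def greedy_sequence(q):
    """Roll out the learned policy without exploration."""
    state = ()
    for _ in range(SEQ_LEN):
        state += (max(ACTIONS, key=lambda a: q[(state, a)]),)
    return state
```

Calling `greedy_sequence(train())` should recover the toy gadget under these settings. A real setup would replace the lookup table with a neural network, since the space of instruction sequences is far too large to enumerate, and would replace `simulated_leak_reward` with timing measurements from the processor under test.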