🤖 AI Summary
RTL simulation is critical in hardware design, yet software-based simulation suffers from prohibitively low throughput for complex designs. This paper proposes a multi-level co-optimization methodology that systematically reduces four major sources of computational overhead—at the supernode, node, and bit levels of abstraction—to build GSIM, an efficient open-source RTL simulator. The approach integrates supernode-level scheduling, fine-grained event management, and bit-level computation optimization, while restructuring the simulation engine through a combination of static analysis and runtime adaptive strategies. Experimental evaluation demonstrates that GSIM successfully simulates Linux boot on the XiangShan processor, achieving a 7.34× speedup over Verilator; on the Rocket core, it attains a 19.94× speedup on the CoreMark benchmark. These results represent an order-of-magnitude performance improvement for large-scale RISC-V designs, establishing GSIM as a scalable, high-throughput RTL simulation framework.
📝 Abstract
Register Transfer Level (RTL) simulation is widely used in design space exploration, verification, debugging, and preliminary performance evaluation for hardware design. Among the various RTL simulation approaches, software simulation is the most commonly used due to its flexibility, low cost, and ease of debugging. However, the slow simulation of complex designs has become a bottleneck in the design flow. In this work, we analyze the sources of computational overhead in RTL simulation and classify them into four factors. To address these factors, we propose several optimization techniques at the supernode level, node level, and bit level. Finally, we implement these techniques in a novel RTL simulator, GSIM. GSIM succeeds in simulating XiangShan, the state-of-the-art open-source RISC-V processor. Moreover, compared to Verilator, GSIM achieves a speedup of 7.34x for booting Linux on XiangShan and 19.94x for running CoreMark on Rocket.