🤖 AI Summary
This work addresses a performance limitation of existing GPU-accelerated static timing analysis (STA): severe intra-warp load imbalance caused by the irregular structure of circuit graphs. The authors propose Warp-STAR, a GPU-accelerated STA engine that orchestrates parallel computations at the warp level to eliminate this imbalance. Warp-STAR achieves a 2.4× speedup over the previous state-of-the-art GPU-based STA implementation and, when integrated into a timing-driven global placement framework, delivers a 1.7× speedup over state-of-the-art frameworks. The same approach also proves effective for differentiable gradient analysis with minimal overhead.
📝 Abstract
Static timing analysis (STA) is crucial for Electronic Design Automation (EDA) flows but remains a computational bottleneck. While existing GPU-based STA engines are faster than their CPU counterparts, they suffer from inefficiencies, particularly intra-warp load imbalance caused by irregular circuit graphs. This paper introduces Warp-STAR, a novel GPU-accelerated STA engine that eliminates this imbalance by orchestrating parallel computations at the warp level. This approach achieves a 2.4× speedup over the previous state-of-the-art (SoTA) GPU-based STA. When integrated into a timing-driven global placement framework, Warp-STAR delivers a 1.7× speedup over SoTA frameworks. The method also proves effective for differentiable gradient analysis with minimal overhead.
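To make the notion of intra-warp load imbalance concrete, the following Python sketch counts lockstep steps for one warp under two work mappings. It is a toy cost model under stated assumptions, not the Warp-STAR algorithm: the function names, the fan-in list, and the assumption that a warp finishes only when its slowest lane finishes are illustrative, and the cost of combining partial results across lanes is ignored.

```python
import math

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def thread_per_node_steps(fanins):
    """Steps when each thread serially processes one node's fan-in edges.

    Hypothetical cost model: threads in a warp run in lockstep, so each
    group of 32 nodes takes as long as its highest-fan-in node.
    """
    steps = 0
    for start in range(0, len(fanins), WARP_SIZE):
        warp = fanins[start:start + WARP_SIZE]
        steps += max(warp)
    return steps


def warp_cooperative_steps(fanins):
    """Steps when a warp's edges are flattened and spread evenly over lanes.

    Hypothetical cost model: the 32 lanes share all edges of the group,
    so the group needs ceil(total_edges / 32) steps (reduction cost ignored).
    """
    steps = 0
    for start in range(0, len(fanins), WARP_SIZE):
        warp = fanins[start:start + WARP_SIZE]
        steps += math.ceil(sum(warp) / WARP_SIZE)
    return steps


# Illustrative irregular group: 31 nodes with one fan-in edge, one with 64.
fanins = [1] * 31 + [64]
print(thread_per_node_steps(fanins))   # 64
print(warp_cooperative_steps(fanins))  # 3
```

In this toy example a single high-fan-in node forces the other 31 lanes to idle under the thread-per-node mapping, while sharing edges across the warp removes most of that idle time. For a perfectly regular group (every node with the same fan-in) both mappings take the same number of steps, which is why the problem is tied to irregular circuit graphs.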