🤖 AI Summary
Traditional static timing analysis (STA) suffers from low computational efficiency and poor utilization of heterogeneous hardware in large-scale industrial designs. To address this, we propose HeteroSTA, the first CPU-GPU heterogeneous STA engine. HeteroSTA offers a set of delay calculation models covering different accuracy-speed trade-offs without relying on an external golden tool. It supports industry formats, including .sdc constraints with all common timing exceptions, clock domains, and case analysis modes. It also provides end-to-end GPU acceleration for both graph-based and path-based timing queries. These capabilities are exposed through a zero-overhead flattened heterogeneous API, and two deployment modes are provided: a standalone binary and an embeddable shared library. Use cases as a standalone tool, within timing-driven DREAMPlace 4.0, and in timing-driven global routing demonstrate significant runtime speed-ups with comparable quality. HeteroSTA is publicly available for both academic research and industrial integration.
📝 Abstract
In this paper, we introduce HeteroSTA, the first CPU-GPU heterogeneous timing analysis engine that efficiently supports: (1) a set of delay calculation models offering a range of accuracy-speed trade-offs without relying on an external golden tool, (2) robust support for industry formats, in particular .sdc constraints covering all common timing exceptions, clock domains, and case analysis modes, and (3) end-to-end GPU acceleration for both graph-based and path-based timing queries, all exposed through a zero-overhead flattened heterogeneous application programming interface (API). HeteroSTA is publicly available as both a standalone binary executable and an embeddable shared library, targeting a broad range of academic and industry applications. Example use cases as a standalone tool, as a timing-driven DREAMPlace 4.0 integration, and as a timing-driven global routing integration all demonstrate remarkable runtime speed-ups with comparable quality.