AI Summary
Conventional coarse-grained reconfigurable array (CGRA) design space exploration relies heavily on post-synthesis simulation, which makes evaluating each candidate design prohibitively slow.
Method: This paper proposes a lightweight behavioral-level simulation framework that enables instantaneous pre-synthesis prediction of power consumption and latency for time-multiplexed CGRA cores. The framework integrates PE array topology, datapath architecture, and timing schedule characteristics to construct a parameterized joint power-latency estimation model.
Contribution/Results: It achieves, for the first time, early-stage cross-core and cross-configuration performance comparison without requiring place-and-route or gate-level simulation. Experimental evaluation demonstrates prediction errors below 8% and over 100× speedup compared to conventional flows, significantly enhancing CGRA architectural exploration efficiency.
Abstract
Between traditional CPU architectures and more specialized options such as FPGAs or ASICs lies a family of reconfigurable hardware architectures termed Coarse-Grained Reconfigurable Arrays (CGRAs). A CGRA is composed of a 2-dimensional array of tightly integrated processing elements (PEs), each capable of performing arithmetic and logic operations. The vast design space of CGRA implementations poses a challenge that calls for fast exploration tools to prune it ahead of time-consuming synthesis runs. The proposed tool aims to simplify this process by simulating kernel execution and providing a characterization framework. The estimator returns energy and latency values otherwise available only through a time-consuming post-synthesis simulation, allowing for instantaneous comparative analysis between different kernels and hardware configurations.
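To make the idea of a behavioral-level estimator concrete, the following is a minimal Python sketch, not the paper's tool. It assumes a time-multiplexed array in which all PEs advance in lockstep, so the slowest operation in a time step sets that step's duration, and it assumes energy can be approximated by summing a fixed per-operation cost. The names `OP_COSTS`, `Estimate`, and `estimate_kernel`, as well as every cycle and energy value, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical per-operation costs as (cycles, energy in pJ).
# These numbers are illustrative only; a real flow would take them
# from a characterization of the target hardware.
OP_COSTS = {
    "nop": (1, 0.5),
    "add": (1, 2.0),
    "mul": (3, 6.0),
    "load": (2, 8.0),
}


@dataclass
class Estimate:
    cycles: int
    energy_pj: float


def estimate_kernel(schedule, op_costs=OP_COSTS):
    """Estimate latency and energy of a kernel schedule.

    `schedule` is a list of time steps; each step is a non-empty list
    holding one operation name per PE.
    """
    total_cycles = 0
    total_energy = 0.0
    for step in schedule:
        # Assumption: PEs run in lockstep, so the slowest op sets the step length.
        total_cycles += max(op_costs[op][0] for op in step)
        # Assumption: step energy is the sum of the per-op energies.
        total_energy += sum(op_costs[op][1] for op in step)
    return Estimate(total_cycles, total_energy)


# Toy kernel on a 2x2 array (4 PEs), flattened to one list per time step.
kernel = [
    ["load", "load", "nop", "nop"],
    ["mul", "add", "add", "nop"],
]
result = estimate_kernel(kernel)
print(result.cycles, result.energy_pj)
```

Because the model is only a table lookup plus a pass over the schedule, comparing two kernels or two hardware configurations amounts to calling `estimate_kernel` with a different schedule or a different cost table, which is the kind of instantaneous comparison the abstract describes.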