A flexible framework for early power and timing comparison of time-multiplexed CGRA kernel executions

Date: 2025-04-02
AI Summary
Conventional coarse-grained reconfigurable array (CGRA) design space exploration relies heavily on post-synthesis simulation, resulting in prohibitively low evaluation efficiency. Method: This paper proposes a lightweight behavioral-level simulation framework that enables instantaneous pre-synthesis prediction of power consumption and latency for time-multiplexed CGRA cores. The framework integrates PE array topology, datapath architecture, and timing schedule characteristics to construct a parameterized joint power–latency estimation model. Contribution/Results: It achieves, for the first time, early-stage cross-core and cross-configuration performance comparison without requiring place-and-route or gate-level simulation. Experimental evaluation demonstrates prediction errors below 8% and over 100× speedup compared to conventional flows, significantly enhancing CGRA architectural exploration efficiency.

Abstract
At the intersection between traditional CPU architectures and more specialized options such as FPGAs or ASICs lies the family of reconfigurable hardware architectures, termed Coarse-Grained Reconfigurable Arrays (CGRAs). CGRAs are composed of a 2-dimensional array of processing elements (PE), tightly integrated with each other, each capable of performing arithmetic and logic operations. The vast design space of CGRA implementations poses a challenge, which calls for fast exploration tools to prune it in advance of time-consuming syntheses. The proposed tool aims to simplify this process by simulating kernel execution and providing a characterization framework. The estimator returns energy and latency values otherwise only available through a time-consuming post-synthesis simulation, allowing for instantaneous comparative analysis between different kernels and hardware configurations.

Problem

Research questions and friction points this paper is trying to address.

Fast exploration of CGRA design space for energy and latency
Simulating kernel execution to avoid time-consuming synthesis
Instant comparison of different kernels and hardware configurations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Flexible framework for CGRA power and timing
Simulates kernel execution for fast exploration
Estimates energy and latency without synthesis
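
To make the idea of estimating energy and latency from a simulated kernel execution concrete, here is a minimal sketch in Python. It is not the paper's tool: the cost table `COSTS`, the `OpCost` type, the `estimate` function, the example kernels, and all numeric values are hypothetical. The sketch also assumes a lockstep model in which every PE advances together, so each time step lasts as long as its slowest operation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpCost:
    cycles: int        # latency of the operation in clock cycles
    energy_pj: float   # energy per execution, in picojoules


# Hypothetical characterization table. Real values would have to come
# from characterizing the target hardware; these are placeholders.
COSTS = {
    "nop": OpCost(1, 0.1),
    "add": OpCost(1, 1.0),
    "mul": OpCost(3, 4.0),
    "load": OpCost(2, 6.0),
}


def estimate(schedule, costs=COSTS):
    """Return (total_cycles, total_energy_pj) for a kernel schedule.

    `schedule` is a list of time steps; each step lists one operation
    name per PE. Assumption: PEs run in lockstep, so a step takes as
    many cycles as its slowest operation.
    """
    total_cycles = 0
    total_energy = 0.0
    for step in schedule:
        total_cycles += max(costs[op].cycles for op in step)
        total_energy += sum(costs[op].energy_pj for op in step)
    return total_cycles, total_energy


# Two hypothetical mappings of work onto a 4-PE array.
kernel_a = [
    ["load", "load", "nop", "nop"],
    ["mul", "mul", "nop", "nop"],
    ["add", "nop", "nop", "nop"],
]
kernel_b = [
    ["load", "load", "load", "load"],
    ["mul", "mul", "add", "add"],
]

for name, kernel in (("kernel_a", kernel_a), ("kernel_b", kernel_b)):
    cycles, energy = estimate(kernel)
    print(f"{name}: {cycles} cycles, {energy:.1f} pJ")
```

Because the estimate is a table lookup over the schedule, comparing kernels or swapping in a different cost table for another hardware configuration takes negligible time, which is the kind of early comparison the abstract describes.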