A Unified Framework for Mapping and Synthesis of Approximate R-Blocks CGRAs

2025-05-29
Citations: 0
Influential: 0
AI Summary
To address the dual demands of low power consumption and high performance for edge AI, this paper proposes an end-to-end design framework for coarse-grained reconfigurable arrays (CGRAs) that support approximate computing. Targeting the R-Blocks architecture, it introduces an output-feature-channel-aware mapping strategy that transparently integrates approximate multipliers and jointly optimizes accuracy and energy efficiency. It also proposes a voltage island partitioning method that exploits the approximate units to reduce power substantially with minimal area overhead. Evaluated on MobileNetV2 with ILSVRC-2012 (ImageNet), the framework reduces power by 30% on average, incurs only a 2% area overhead, reaches up to 440 GOPS/W, and introduces a relatively small output error during inference. The generated architectures outperform several state-of-the-art CGRA architectures in throughput and energy efficiency.

๐Ÿ“ Abstract
The ever-increasing complexity and operational diversity of modern Neural Networks (NNs) have created a need for edge devices for AI applications that are both low-power and high-performance. Coarse-Grained Reconfigurable Architectures (CGRAs) are a promising design paradigm for addressing these challenges, delivering close-to-ASIC performance while allowing for hardware programmability. In this paper, we introduce a novel end-to-end exploration and synthesis framework for approximate CGRA processors that enables transparent and optimized integration and mapping of state-of-the-art approximate multiplication components into CGRAs. Our methodology introduces a per-channel exploration strategy that maps specific output features onto approximate components based on accuracy degradation constraints. This optimizes the system's energy consumption while keeping accuracy above a given threshold. At the circuit level, the integration of approximate components enables the creation of voltage islands that operate at reduced voltage levels, which is possible because of their inherently shorter critical paths. This key enabler reduces overall power consumption by an average of 30% across our analyzed architectures, compared to their baseline counterparts, while incurring only a minimal 2% area overhead. The proposed methodology was evaluated on a widely used NN model, MobileNetV2, on the ImageNet dataset, demonstrating that the generated architectures can deliver up to 440 GOPS/W with relatively small output error during inference, outperforming several state-of-the-art CGRA architectures in terms of throughput and energy efficiency.
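
The per-channel exploration described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it is a simple greedy heuristic, and it assumes that each output channel comes with a pre-measured energy saving and accuracy drop, and that accuracy drops add up across channels. The names `Channel`, `map_channels`, and `max_accuracy_drop` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    index: int            # output feature channel id
    energy_saving: float  # assumed saving if run on an approximate multiplier
    accuracy_drop: float  # assumed accuracy loss (percentage points) if approximated


def map_channels(channels, max_accuracy_drop):
    """Return the sorted channel ids assigned to approximate multipliers.

    Greedy heuristic (an assumption, not the paper's method): take channels
    in order of energy saved per unit of accuracy lost, as long as the
    remaining accuracy budget allows it.
    """
    def ratio(ch):
        if ch.accuracy_drop > 0:
            return ch.energy_saving / ch.accuracy_drop
        return float("inf")  # free savings go first

    budget = max_accuracy_drop
    approximate = []
    for ch in sorted(channels, key=ratio, reverse=True):
        if ch.accuracy_drop <= budget:
            approximate.append(ch.index)
            budget -= ch.accuracy_drop
    return sorted(approximate)


# Made-up numbers, for illustration only.
channels = [
    Channel(0, energy_saving=4.0, accuracy_drop=0.5),
    Channel(1, energy_saving=3.0, accuracy_drop=0.125),
    Channel(2, energy_saving=1.0, accuracy_drop=0.75),
    Channel(3, energy_saving=3.0, accuracy_drop=0.25),
]
print(map_channels(channels, max_accuracy_drop=1.0))
```

Channels left out of the returned list would stay on exact multipliers. In a real flow, the accuracy drop would have to come from evaluating the network, since per-channel errors do not generally add up linearly.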

Problem

Research questions and friction points this paper is trying to address.

Develops a framework for energy-efficient approximate CGRA processors
Balances accuracy and power using approximate multiplication components
Enables voltage islands that cut power by 30% on average with about 2% area overhead

Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end exploration and synthesis framework
Per-channel approximate component mapping strategy
Voltage islands for reduced power consumption
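
Why a reduced-voltage island saves power can be seen from the standard first-order model of CMOS dynamic power. The symbols below are assumptions introduced for illustration, and the numeric ratio is an example, not a value reported by the paper. Gate delay grows as the supply voltage drops, so a unit with a shorter critical path has timing slack that can be traded for a lower supply voltage at the same clock frequency.

```latex
% First-order dynamic power model (symbols are illustrative assumptions):
%   \alpha : switching activity,  C : switched capacitance,
%   V      : supply voltage,      f : clock frequency
P_{\mathrm{dyn}} = \alpha \, C \, V^{2} f

% Island moved from V_{\mathrm{nom}} to V_{\mathrm{low}} at unchanged \alpha, C, f:
\frac{P_{\mathrm{low}}}{P_{\mathrm{nom}}}
  = \left( \frac{V_{\mathrm{low}}}{V_{\mathrm{nom}}} \right)^{2}

% Example with an assumed ratio V_{\mathrm{low}} / V_{\mathrm{nom}} = 0.9:
\frac{P_{\mathrm{low}}}{P_{\mathrm{nom}}} = 0.9^{2} = 0.81
```

Under this model, a 10% voltage reduction lowers the dynamic power of that island by 19%. The chip-level saving depends on how much of the design sits inside the island and on overheads such as level shifters, which this sketch does not model.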