CIM-Tuner: Balancing the Compute and Storage Capacity of SRAM-CIM Accelerator via Hardware-mapping Co-exploration

📅 2026-01-26
🤖 AI Summary
This work addresses the performance limitations of SRAM-based compute-in-memory (CIM) accelerators, which stem from architectural heterogeneity and inadequate mapping strategies that hinder the balance between computation and memory capabilities. To overcome this challenge, the authors propose a hardware-mapping co-optimization methodology that automatically searches for the optimal hardware configuration and dataflow mapping under a fixed area budget. By introducing a generic CIM macro-array abstraction and an accelerator template, the approach enables cross-architecture adaptability. It further expands the optimization space through fine-grained two-level mapping, accelerator-level scheduling, and macro-level tiling techniques. Experimental results demonstrate that, under the same area constraint, the proposed solution achieves 1.58× higher energy efficiency and 2.11× greater throughput compared to state-of-the-art CIM mapping approaches and outperforms existing advanced CIM accelerators.

📝 Abstract
As an emerging class of AI computing accelerator, SRAM Computing-In-Memory (CIM) accelerators offer high energy efficiency and throughput. However, the variety of CIM designs and under-explored mapping strategies impede the full exploration of compute-storage balancing in SRAM-CIM accelerators, potentially leading to significant performance degradation. To address this issue, we propose CIM-Tuner, an automatic tool that balances hardware and searches for the optimal mapping strategy under an area constraint via hardware-mapping co-exploration. It ensures universality across various CIM designs through a matrix abstraction of CIM macros and a generalized accelerator template. For efficient mapping under different hardware configurations, it employs fine-grained two-level strategies comprising accelerator-level scheduling and macro-level tiling. Compared to prior CIM mapping approaches, CIM-Tuner's extended strategy space achieves 1.58$\times$ higher energy efficiency and 2.11$\times$ higher throughput. Applied to SOTA CIM accelerators with an identical area budget, CIM-Tuner also delivers comparable improvements. The simulation accuracy is silicon-verified, and the CIM-Tuner tool is open-sourced at https://github.com/champloo2878/CIM-Tuner.git.
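The hardware-mapping co-exploration described in the abstract can be sketched as a nested search: an outer loop over hardware configurations that respect the area budget, and an inner loop over mapping (tiling) choices for each configuration. The sketch below is purely illustrative — all names, the area model, and the cost proxy are placeholder assumptions, not the authors' implementation or cost model.

```python
# Hypothetical sketch of hardware-mapping co-exploration under a fixed
# area budget, in the spirit of CIM-Tuner's two-level search. The area
# and cost models here are toy placeholders, not the paper's code.
from itertools import product

AREA_BUDGET = 16.0  # arbitrary area units

def macro_area(rows, cols):
    # toy area model: proportional to the macro-array footprint
    return rows * cols * 0.25

def cost(rows, cols, tile):
    # toy energy-delay proxy: penalize imbalance between the tile size
    # (compute demand) and the macro capacity (storage supply)
    capacity = rows * cols
    utilization = min(tile, capacity) / capacity
    return 1.0 / utilization + abs(capacity - tile) * 0.01

def co_explore():
    best = None
    # outer loop: candidate hardware configurations (macro-array shapes)
    for rows, cols in product([2, 4, 8], repeat=2):
        if macro_area(rows, cols) > AREA_BUDGET:
            continue  # respect the fixed area constraint
        # inner loop: macro-level tiling choices for this configuration
        for tile in (4, 8, 16, 32, 64):
            c = cost(rows, cols, tile)
            if best is None or c < best[0]:
                best = (c, (rows, cols), tile)
    return best

print(co_explore())
```

Under this toy cost model the search converges on configurations whose tile size matches the macro capacity exactly, which is the compute-storage balance the paper targets.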
Problem

Research questions and friction points this paper is trying to address.

SRAM-CIM
compute-storage balance
mapping strategy
hardware acceleration
performance degradation
Innovation

Methods, ideas, or system contributions that make the work stand out.

SRAM-CIM
hardware-mapping co-exploration
compute-storage balancing
two-level mapping
accelerator optimization
Jinwu Chen
School of Integrated Circuits, Southeast University, China; National Center of Technology Innovation for EDA, China
Yuhui Shi
Chair Professor, Computer Science and Engineering, Southern University of Science and Technology
Evolutionary Computation, Swarm Intelligence, Particle Swarm Optimization Algorithm, Brain Storm Optimization
He Wang
School of Integrated Circuits, Southeast University, China
Zhe Jiang
Southeast University, People's Republic of China.
(Micro-)Architecture, Embedded System, Design Automation, Safety, Real-time
Jun Yang
School of Integrated Circuits, Southeast University, China; National Center of Technology Innovation for EDA, China
Xin Si
Southeast University
Memory, Computation in memory, AI processor, Analog/mixed signal circuit
Zhenhua Zhu
Department of Electronic Engineering, Tsinghua University, China