Row-Column Hybrid Grouping for Fault-Resilient Multi-Bit Weight Representation on IMC Arrays

📅 2025-08-21
🤖 AI Summary
To address the computational unreliability that stuck-at faults (SAFs) cause in analog in-memory computing (IMC) systems, along with the high compilation overhead and poor scalability of existing fault-mitigation algorithms such as Fault-Free (FF), this work proposes row-column hybrid grouping, a multi-bit weight representation that generalizes column grouping by adding redundancy across both rows and columns. Fault-aware weight decomposition is formulated as an integer linear programming (ILP) problem that off-the-shelf solvers can handle, and theoretical analysis identifies fault patterns that admit trivial solutions, which further reduces computation. Evaluated on convolutional neural networks and small language models, the approach achieves up to an 8-percentage-point accuracy improvement, 150× faster compilation, and a 2× energy-efficiency gain over existing baselines.

📝 Abstract
This paper addresses two critical challenges in analog In-Memory Computing (IMC) systems that limit their scalability and deployability: the computational unreliability caused by stuck-at faults (SAFs) and the high compilation overhead of existing fault-mitigation algorithms, namely Fault-Free (FF). To overcome these limitations, we first propose a novel multi-bit weight representation technique, termed row-column hybrid grouping, which generalizes conventional column grouping by introducing redundancy across both rows and columns. This structural redundancy enhances fault tolerance and can be effectively combined with existing fault-mitigation solutions. Second, we design a compiler pipeline that reformulates the fault-aware weight decomposition problem as an Integer Linear Programming (ILP) task, enabling fast and scalable compilation through off-the-shelf solvers. Further acceleration is achieved through theoretical insights that identify fault patterns amenable to trivial solutions, significantly reducing computation. Experimental results on convolutional networks and small language models demonstrate the effectiveness of our approach, which achieves up to an 8 percentage point improvement in accuracy, 150x faster compilation, and a 2x gain in energy efficiency compared to existing baselines.
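To make the idea of fault-aware weight decomposition concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's algorithm: weights are unsigned, each cell stores one of `levels` values, all cells in a column share one significance, and a stuck cell is pinned to a fixed value. The function name `decompose` and its parameters are hypothetical, and an exhaustive search stands in for the ILP solver the abstract describes.

```python
from itertools import product

def decompose(target, sigs, rows, levels, stuck):
    """Search for cell values whose weighted sum equals `target`.

    sigs   : significance of each column, e.g. [16, 4, 1] for 2-bit cells
    rows   : cells per column (rows=1 models plain column grouping)
    levels : number of values a healthy cell can store (0 .. levels-1)
    stuck  : {(row, col): value} for cells pinned by a stuck-at fault
    Returns a rows x len(sigs) grid, or None if no assignment exists.
    """
    cells = [(r, c) for r in range(rows) for c in range(len(sigs))]
    free = [cell for cell in cells if cell not in stuck]
    # Contribution of the faulty cells is fixed and cannot be changed.
    fixed = sum(sigs[c] * v for (r, c), v in stuck.items())
    # Brute force over the healthy cells (assumption: stands in for an ILP).
    for values in product(range(levels), repeat=len(free)):
        total = fixed + sum(sigs[c] * v for (r, c), v in zip(free, values))
        if total == target:
            grid = [[0] * len(sigs) for _ in range(rows)]
            for (r, c), v in stuck.items():
                grid[r][c] = v
            for (r, c), v in zip(free, values):
                grid[r][c] = v
            return grid
    return None

# The least-significant cell of row 0 is stuck at 0.
fault = {(0, 2): 0}
print(decompose(5, [16, 4, 1], rows=1, levels=4, stuck=fault))
print(decompose(5, [16, 4, 1], rows=2, levels=4, stuck=fault))
```

In this toy setting, a single row cannot represent the weight 5 once its least-significant cell is stuck at 0, while a second row of cells sharing the same column significances supplies the missing unit. That is the intuition behind adding redundancy across rows as well as columns; the actual grouping and encoding in the paper may differ.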
Problem

Research questions and friction points this paper is trying to address.

Enhancing fault tolerance in analog IMC systems
Reducing compilation overhead for fault mitigation
Improving multi-bit weight representation reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Row-column hybrid grouping for multi-bit weights
ILP-based compiler pipeline for fast decomposition
Fault pattern analysis for trivial solution acceleration
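The last point, skipping the solver when a fault pattern has a trivial solution, can be sketched in Python as follows. This shows one plausible trivial case only and is an assumption, not the paper's theoretical result: if the ordinary digit expansion of a weight already agrees with every stuck cell, no search is needed. The names `digits` and `trivial_solution` are hypothetical, and the model is a single row of columns with unsigned weights.

```python
def digits(target, base, n_cols):
    """Most-significant-first base-`base` digits of `target`, padded to n_cols."""
    out = []
    for _ in range(n_cols):
        out.append(target % base)
        target //= base
    if target:
        raise ValueError("target does not fit in n_cols digits")
    return out[::-1]

def trivial_solution(target, base, n_cols, stuck):
    """Return the plain digit expansion if it is compatible with all faults.

    stuck : {col: value} for columns whose cell is pinned by a stuck-at fault
    Returns None when the fault pattern conflicts with the expansion, which
    is the case that would be handed to a solver (assumed, not shown here).
    """
    d = digits(target, base, n_cols)
    if all(d[col] == value for col, value in stuck.items()):
        return d
    return None

print(trivial_solution(27, 4, 3, {}))       # no faults: expansion is valid
print(trivial_solution(27, 4, 3, {2: 3}))   # fault matches the needed digit
print(trivial_solution(27, 4, 3, {2: 0}))   # conflict: needs a real search
```

A check like this costs a few integer operations per weight, so filtering out compatible fault patterns before invoking a solver is one way the compilation time could drop.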
Kang Eun Jeon
Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, Korea; Kim Jaechul Graduate School of AI, Korea Advanced Institute of Science and Technology, Seongnam, Korea
Sangheum Yeon
Department of Semiconductor Engineering, Sungkyunkwan University, Suwon, Korea
Jinhee Kim
Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, Korea; Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
Hyeonsu Bang
Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, Korea
Johnny Rhe
Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, Korea
Jong Hwan Ko
SungKyunKwan Univ. (SKKU)
Deep learning accelerator, Image/audio processing, VLSI/IoT systems design