Sensitivity-Aware Mixed-Precision Quantization for ReRAM-based Computing-in-Memory

📅 2025-12-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the trade-off between energy efficiency and accuracy that conventional quantization methods face on ReRAM-based compute-in-memory (CIM) architectures, this paper proposes a sensitivity-guided, structured mixed-precision quantization method. The method uses weight sensitivity analysis to allocate bit-widths across layers and channels, and jointly optimizes the mapping of weights onto ReRAM crossbar arrays to improve hardware utilization. Compared to fixed-precision baselines, the approach achieves 86.33% accuracy at 70% model compression and reduces power consumption by 40%, while also lowering latency and computational load. The key innovation is described as the first unified modeling of sensitivity-driven quantization, structured mixed-precision assignment, and crossbar mapping, which allows compression ratio, energy efficiency, and accuracy to be co-optimized in ReRAM CIM systems.
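The idea of sensitivity-guided bit-width allocation can be illustrated with a minimal sketch. The summary does not specify the paper's sensitivity metric or allocation procedure, so everything here is an assumption: the sensitivity proxy (mean squared error after 2-bit quantization), the greedy upgrade rule, the candidate bit-widths, and the function names `quantize`, `sensitivity`, and `allocate_bits` are all hypothetical.

```python
def quantize(weights, bits):
    """Uniform symmetric quantization of a list of floats (bits >= 2)."""
    levels = 2 ** (bits - 1) - 1                 # positive integer levels
    scale = max(abs(w) for w in weights) / levels or 1.0  # avoid scale == 0
    return [round(w / scale) * scale for w in weights]


def sensitivity(weights, bits=2):
    """Assumed proxy: mean squared error caused by low-bit quantization."""
    q = quantize(weights, bits)
    return sum((w - x) ** 2 for w, x in zip(weights, q)) / len(weights)


def allocate_bits(layers, budget_bits, choices=(2, 4, 8)):
    """Greedy allocation: every layer starts at the lowest bit-width, then
    the most sensitive layers are upgraded first while the budget allows."""
    choices = sorted(choices)
    bits = {name: choices[0] for name in layers}
    used = sum(len(w) * choices[0] for w in layers.values())
    ranked = sorted(layers,
                    key=lambda n: sensitivity(layers[n], choices[0]),
                    reverse=True)
    for name in ranked:
        for b in reversed(choices[1:]):          # try the widest option first
            extra = len(layers[name]) * (b - bits[name])
            if used + extra <= budget_bits:
                used += extra
                bits[name] = b
                break
    return bits


# Hypothetical two-layer model: "a" has spread-out weights, "b" is all zeros.
toy_layers = {"a": [0.9, -0.5, 0.3, 0.1], "b": [0.0, 0.0, 0.0, 0.0]}
toy_bits = allocate_bits(toy_layers, budget_bits=40)
```

In this toy case the layer with non-trivial weights receives the high bit-width and the insensitive layer stays at the lowest one. A real pipeline would more likely measure sensitivity through the effect on task loss rather than weight reconstruction error.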

📝 Abstract

Compute-In-Memory (CIM) systems, particularly those utilizing ReRAM and memristive technologies, offer a promising path toward energy-efficient neural network computation. However, conventional quantization and compression techniques often fail to fully optimize performance and efficiency in these architectures. In this work, we present a structured quantization method that combines sensitivity analysis with mixed-precision strategies to enhance weight storage and computational performance on ReRAM-based CIM systems. Our approach improves ReRAM Crossbar utilization, significantly reducing power consumption, latency, and computational load, while maintaining high accuracy. Experimental results show 86.33% accuracy at 70% compression, alongside a 40% reduction in power consumption, demonstrating the method's effectiveness for power-constrained applications.
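To make the compression figure concrete, the following sketch shows one common way to compute a compression ratio for a mixed-precision model. The abstract does not define how its 70% figure is measured, so the definition used here (one minus quantized storage bits over baseline storage bits), the 8-bit default baseline, and the function name `compression_ratio` are assumptions for illustration only.

```python
def compression_ratio(param_counts, bit_widths, baseline_bits=8):
    """Fraction of weight storage saved relative to a uniform baseline.

    param_counts  -- number of weights in each layer
    bit_widths    -- bit-width assigned to each layer
    baseline_bits -- assumed uniform precision of the uncompressed model
    """
    quantized = sum(n * b for n, b in zip(param_counts, bit_widths))
    baseline = baseline_bits * sum(param_counts)
    return 1.0 - quantized / baseline


# Hypothetical model: a small 8-bit layer and a larger 2-bit layer.
example_ratio = compression_ratio([1000, 3000], [8, 2])
```

Under this definition, a 70% ratio corresponds to an average of 2.4 bits per weight against an 8-bit baseline, or 9.6 bits against a 32-bit baseline. The source does not say which baseline applies.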

Problem

Research questions and friction points this paper is trying to address.

Conventional quantization and compression techniques are not fully optimized for ReRAM-based computing-in-memory systems

How sensitivity analysis can improve weight storage and computational performance

How to reduce power consumption and latency while maintaining high accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Sensitivity-aware mixed-precision quantization for ReRAM CIM

Structured method combining sensitivity analysis with mixed-precision

Improves crossbar utilization, reducing power consumption and latency
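The link between bit-width and crossbar usage can be sketched with a simple bit-sliced mapping model. This is not the paper's mapping scheme, which the summary does not describe. It assumes each weight is split across `ceil(weight_bits / cell_bits)` adjacent cells in a row, and that crossbars are 128×128 arrays with 2-bit cells. The function name `map_to_crossbars` and all parameter defaults are hypothetical.

```python
import math


def map_to_crossbars(rows, cols, weight_bits,
                     cell_bits=2, xbar_rows=128, xbar_cols=128):
    """Return (crossbar count, cell utilization) for one weight matrix.

    Assumes a bit-sliced layout: each weight occupies several adjacent
    columns, one per slice of `cell_bits` bits.
    """
    slices = math.ceil(weight_bits / cell_bits)     # cells per weight
    cols_needed = cols * slices                     # physical columns used
    tiles = (math.ceil(rows / xbar_rows)
             * math.ceil(cols_needed / xbar_cols))  # crossbars allocated
    utilization = (rows * cols_needed) / (tiles * xbar_rows * xbar_cols)
    return tiles, utilization


# Hypothetical 256x64 layer at two precisions.
tiles_8bit, util_8bit = map_to_crossbars(256, 64, weight_bits=8)
tiles_4bit, util_4bit = map_to_crossbars(256, 64, weight_bits=4)
```

Under these assumptions, halving the bit-width of this layer halves the number of crossbars it occupies, which is the kind of saving a mixed-precision assignment aims for. Matrices that do not divide evenly into the array size leave cells unused, so bit-width choices and mapping interact.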
