Mixed-Precision Training and Compilation for RRAM-based Computing-in-Memory Accelerators

📅 2026-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of RRAM-based compute-in-memory (CIM) accelerators, which suffer from low input and memory bitwidths and lack compiler support for sub-8-bit quantization, resulting in multi-cycle matrix-vector multiplication and inefficient weight storage. To overcome these challenges, the authors propose the first CIM-oriented mixed-precision training and compilation framework that leverages reinforcement learning to automatically explore quantization configurations balancing latency and accuracy within a vast search space. By moving beyond conventional fixed-precision paradigms, the approach achieves up to 2.48× speedup with only a 0.086% accuracy loss, significantly enhancing the energy efficiency and throughput of CIM accelerators.

📝 Abstract
Computing-in-Memory (CIM) accelerators are a promising solution for accelerating Machine Learning (ML) workloads, as they perform Matrix-Vector Multiplications (MVMs) on crossbar arrays directly in memory. Although the bit widths of the crossbar inputs and cells are very limited, most CIM compilers do not support quantization below 8 bits. As a result, a single MVM requires many compute cycles, and weights cannot be efficiently stored in a single crossbar cell. To address this problem, we propose a mixed-precision training and compilation framework for CIM architectures. The biggest challenge is the massive search space, which makes it difficult to find good quantization parameters. We therefore introduce a reinforcement learning-based strategy to find suitable quantization configurations that balance latency and accuracy. In the best case, our approach achieves up to a 2.48x speedup over existing state-of-the-art solutions, with an accuracy loss of only 0.086%.
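The core idea in the abstract — searching per-layer quantization configurations for a latency/accuracy trade-off — can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the bitwidth choices, the toy latency and accuracy models, and the use of plain random search as a stand-in for the paper's reinforcement learning agent. It is not the authors' implementation.

```python
import random

# Hypothetical search space: candidate (input, weight) bitwidths per layer.
BITWIDTHS = [2, 4, 8]
NUM_LAYERS = 4

def latency(config):
    # Toy cost model (assumption): with bit-serial inputs and multi-cell
    # weights, one crossbar MVM needs roughly input_bits * weight_bits cycles.
    return sum(i * w for i, w in config)

def accuracy_proxy(config):
    # Toy surrogate (assumption): lower bitwidths cost more accuracy.
    # The paper instead evaluates the mixed-precision-trained model itself.
    return 1.0 - 0.005 * sum((8 - i) + (8 - w) for i, w in config)

def reward(config, lam=0.01):
    # Scalar reward balancing accuracy against latency, as the RL agent does.
    return accuracy_proxy(config) - lam * latency(config)

def random_search(steps=2000, seed=0):
    # Stand-in for the RL policy: sample configurations and keep the best.
    rng = random.Random(seed)
    best, best_r = None, float("-inf")
    for _ in range(steps):
        config = [(rng.choice(BITWIDTHS), rng.choice(BITWIDTHS))
                  for _ in range(NUM_LAYERS)]
        r = reward(config)
        if r > best_r:
            best, best_r = config, r
    return best, best_r
```

Under this toy reward, any mixed- or low-precision configuration found by the search beats the uniform 8-bit baseline, which mirrors the paper's motivation for moving beyond fixed-precision compilation.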
Problem

Research questions and friction points this paper is trying to address.

Mixed-Precision
Computing-in-Memory
Quantization
RRAM
Matrix-Vector Multiplication
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixed-Precision Training
Computing-in-Memory
Reinforcement Learning
Quantization
RRAM