🤖 AI Summary
Second-order optimization is rarely used in large-scale neural network training due to prohibitive computational overhead—especially Hessian matrix inversion. This paper proposes the first RRAM-based analog in-memory computing accelerator for second-order optimization, enabling hardware-level, single-step Hessian inversion and thereby overcoming the throughput and energy-efficiency bottlenecks of conventional digital implementations. The approach co-designs the analog compute capabilities of RRAM crossbar arrays with second-order optimization algorithms to achieve efficient and numerically stable training directly in hardware. Experiments demonstrate that, on handwritten letter classification, the accelerator reduces training epochs by 26% and 61% compared to SGD with momentum and Adam, respectively; on a larger-scale task, it achieves 5.88× higher throughput and 6.9× better energy efficiency than state-of-the-art digital processors. This work establishes a scalable in-memory computing paradigm for efficient large-model training.
📝 Abstract
Second-order optimization methods, which leverage curvature information, offer faster and more stable convergence than first-order methods such as stochastic gradient descent (SGD) and Adam. However, their practical adoption is hindered by the prohibitively high cost of inverting the second-order information matrix, particularly in large-scale neural network training. Here, we present the first demonstration of a second-order optimizer powered by in-memory analog matrix computing (AMC) using resistive random-access memory (RRAM), which performs matrix inversion (INV) in a single step. We validate the optimizer by training a two-layer convolutional neural network (CNN) for handwritten letter classification, achieving 26% and 61% fewer training epochs than SGD with momentum and Adam, respectively. On a larger task using the same second-order method, our system delivers a 5.88× improvement in throughput and a 6.9× gain in energy efficiency compared to state-of-the-art digital processors. These results demonstrate the feasibility and effectiveness of AMC circuits for second-order neural network training, opening a new path toward energy-efficient AI acceleration.
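To make the core idea concrete, here is a minimal NumPy sketch (not the paper's implementation) of why curvature information accelerates convergence: a Newton-type update solves a linear system against the Hessian rather than taking a small gradient step, and on a quadratic loss it reaches the minimizer in a single update. The paper's contribution is performing the expensive inversion/solve step in one analog operation on an RRAM crossbar; here `np.linalg.solve` stands in for that hardware step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy quadratic loss L(w) = 0.5 * w^T A w - b^T w, so the
# Hessian is A (symmetric positive definite) and grad = A w - b.
A = np.array([[3.0, 0.5],
              [0.5, 1.0]])
b = np.array([1.0, -2.0])

def grad(w):
    return A @ w - b

w0 = rng.standard_normal(2)

# Newton step: w <- w - H^{-1} g. The RRAM AMC circuit performs
# this inversion in a single analog step; digitally it is an O(n^3) solve.
w_newton = w0 - np.linalg.solve(A, grad(w0))

# For comparison, one first-order (gradient descent) step with a small
# learning rate makes only incremental progress toward the minimizer.
w_sgd = w0 - 0.1 * grad(w0)

w_star = np.linalg.solve(A, b)  # analytic minimizer of the quadratic
print(np.allclose(w_newton, w_star))  # the Newton step lands exactly on it
```

On a quadratic objective the single Newton step is exact, which is why reducing the cost of the inversion (here, via analog in-memory computing) directly translates into fewer training epochs on the real, non-quadratic losses studied in the paper.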