🤖 AI Summary
Conventional architectures face energy-efficiency bottlenecks due to data movement, while RRAM-based in-memory computing (IMC) suffers from device non-idealities and scalability limitations. To address these challenges, this paper proposes MELISO+, a full-stack distributed IMC framework. Leveraging algorithm-hardware co-design, it introduces a two-tier error correction mechanism that reduces first- and second-order arithmetic errors by over 90%. Its scalable distributed architecture supports matrix operations exceeding 65,000 × 65,000 in dimension. MELISO+ targets high-dimensional workloads, including large language models and generative AI, and allows lower-precision RRAM devices to outperform high-precision alternatives in accuracy, energy, and latency. The paper reports an improvement in energy efficiency of three to five orders of magnitude (10³ to 10⁵×) and a 100-fold reduction in latency.
📝 Abstract
The exponential growth in global computing demand is exacerbated by the high energy requirements of conventional architectures, which stem primarily from energy-intensive data movement. In-memory computing with Resistive Random Access Memory (RRAM) addresses this by co-integrating memory and processing, but faces significant hurdles related to device-level non-idealities and poor scalability for large computing tasks. Here, we introduce \textbf{MELISO+} (In-\textbf{Me}mory \textbf{Li}near \textbf{So}lver), a full-stack, distributed framework for energy-efficient in-memory computing. MELISO+ proposes a novel two-tier error correction mechanism to mitigate device non-idealities and develops a distributed RRAM computing framework to enable matrix computations exceeding dimensions of $65{,}000 \times 65{,}000$. This approach reduces first- and second-order arithmetic errors due to device non-idealities by over 90%, enhances energy efficiency by three to five orders of magnitude, and decreases latency 100-fold. Hence, MELISO+ allows lower-precision RRAM devices to outperform high-precision device alternatives in accuracy, energy, and latency metrics. By unifying algorithm-hardware co-design with a scalable architecture, MELISO+ significantly advances sustainable, high-dimensional computing suitable for applications such as large language models and generative AI.
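The abstract does not describe how the distributed framework partitions a large matrix, so the following is only an illustrative sketch of one common approach: block tiling, where a matrix too large for a single crossbar is split into fixed-size tiles and the partial products are summed. The function names (`crossbar_matvec`, `tiled_matvec`), the `tile` parameter, and the multiplicative Gaussian noise model are all assumptions made for illustration. They are not taken from MELISO+, and the sketch does not implement the paper's two-tier error correction.

```python
import random


def crossbar_matvec(block, x_slice, noise_std, rng):
    """Multiply one tile by a vector slice.

    Each weight is perturbed by multiplicative Gaussian noise, a toy
    stand-in for RRAM conductance variation (hypothetical model, not a
    calibrated device model and not the MELISO+ method).
    """
    out = []
    for row in block:
        acc = 0.0
        for w, x in zip(row, x_slice):
            acc += w * (1.0 + rng.gauss(0.0, noise_std)) * x
        out.append(acc)
    return out


def tiled_matvec(A, x, tile, noise_std=0.0, seed=0):
    """Compute y = A x by splitting A into tile-by-tile blocks.

    Each block is handed to a simulated crossbar and the partial
    results are accumulated into the output vector.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(A), len(x)
    y = [0.0] * n_rows
    for r0 in range(0, n_rows, tile):
        for c0 in range(0, n_cols, tile):
            # Slice out one tile; edge tiles may be smaller than `tile`.
            block = [row[c0:c0 + tile] for row in A[r0:r0 + tile]]
            partial = crossbar_matvec(block, x[c0:c0 + tile], noise_std, rng)
            for i, p in enumerate(partial):
                y[r0 + i] += p
    return y


def exact_matvec(A, x):
    """Reference product computed without tiling or noise."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    x = [1, 0, -1]
    print("exact :", exact_matvec(A, x))
    print("tiled :", tiled_matvec(A, x, tile=2))
    print("noisy :", tiled_matvec(A, x, tile=2, noise_std=0.05, seed=1))
```

With `noise_std=0.0` the tiled result matches the reference product, which shows that tiling alone does not change the computation. Setting `noise_std` above zero shows how per-device perturbations propagate into the output, which is the kind of arithmetic error an error-correction layer would need to reduce.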