Joint Hardware-Workload Co-Optimization for In-Memory Computing Accelerators

📅 2026-03-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited generalization of existing in-memory computing accelerators, which are typically optimized for a single neural network. To overcome this limitation, we propose the first hardware-software co-optimization framework tailored for multi-workload scenarios. By employing an enhanced evolutionary algorithm, our approach jointly searches the RRAM/SRAM-based in-memory computing architecture and multi-model deployment strategies while explicitly modeling cross-task trade-offs. The resulting accelerator design achieves strong generality without sacrificing efficiency. Evaluated across four and nine concurrent workloads, our solution reduces the energy-delay-area product (EDAP) by 76.2% and 95.5% compared to baseline designs, respectively. These results significantly narrow the performance gap between general-purpose and application-specific accelerators, demonstrating exceptional robustness and adaptability across diverse workloads.

📝 Abstract
Software-hardware co-design is essential for optimizing in-memory computing (IMC) hardware accelerators for neural networks. However, most existing optimization frameworks target a single workload, leading to highly specialized hardware designs that do not generalize well across models and applications. In contrast, practical deployment scenarios require a single IMC platform that can efficiently support multiple neural network workloads. This work presents a joint hardware-workload co-optimization framework based on an optimized evolutionary algorithm for designing generalized IMC accelerator architectures. By explicitly capturing cross-workload trade-offs rather than optimizing for a single model, the proposed approach significantly reduces the performance gap between workload-specific and generalized IMC designs. The framework is evaluated on both RRAM- and SRAM-based IMC architectures, demonstrating strong robustness and adaptability across diverse design scenarios. Compared to baseline methods, the optimized designs achieve energy-delay-area product (EDAP) reductions of up to 76.2% and 95.5% when optimizing across a small set (4 workloads) and a large set (9 workloads), respectively. The source code of the framework is available at https://github.com/OlgaKrestinskaya/JointHardwareWorkloadOptimizationIMC.
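The co-optimization idea in the abstract — an evolutionary search over IMC hardware parameters whose fitness aggregates EDAP across several workloads rather than one — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the design-space knobs (`crossbar_size`, `adc_bits`, `num_tiles`), the toy cost model in `edap`, and all function names are assumptions; the actual framework evaluates real RRAM/SRAM architectures and also searches multi-model deployment strategies.

```python
import random

# Hypothetical IMC hardware design space (illustrative knobs, not from the paper).
DESIGN_SPACE = {
    "crossbar_size": [64, 128, 256],
    "adc_bits": [4, 6, 8],
    "num_tiles": [8, 16, 32],
}

def random_design():
    """Sample one hardware configuration uniformly from the design space."""
    return {k: random.choice(v) for k, v in DESIGN_SPACE.items()}

def mutate(design):
    """Re-sample one randomly chosen knob of a parent design."""
    child = dict(design)
    key = random.choice(list(DESIGN_SPACE))
    child[key] = random.choice(DESIGN_SPACE[key])
    return child

def edap(design, workload):
    # Stand-in energy-delay-area product model; a real framework would use
    # circuit-level simulation of the RRAM/SRAM IMC fabric per workload.
    energy = workload["macs"] / design["num_tiles"] * design["adc_bits"]
    delay = workload["macs"] / (design["num_tiles"] * design["crossbar_size"])
    area = design["num_tiles"] * design["crossbar_size"] ** 2 * 1e-6
    return energy * delay * area

def fitness(design, workloads):
    # Aggregate EDAP over ALL workloads, so the search rewards a design
    # that generalizes instead of one specialized to a single network.
    return sum(edap(design, w) for w in workloads) / len(workloads)

def evolve(workloads, pop_size=20, generations=30, seed=0):
    """Elitist evolutionary search: keep the best half, refill with mutants."""
    random.seed(seed)
    pop = [random_design() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda d: fitness(d, workloads))
        survivors = pop[: pop_size // 2]
        pop = survivors + [mutate(random.choice(survivors)) for _ in survivors]
    return min(pop, key=lambda d: fitness(d, workloads))
```

Usage would look like `best = evolve([{"macs": 1e9}, {"macs": 5e8}])`, where each dict is a crude stand-in for one neural-network workload; the elitist survivor selection guarantees the best cross-workload design found so far is never lost between generations.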
Problem

Research questions and friction points this paper is trying to address.

in-memory computing
hardware-workload co-optimization
generalized accelerator design
neural network workloads
cross-workload generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

in-memory computing
hardware-workload co-optimization
evolutionary algorithm
generalized accelerator design
energy-delay-area product
Olga Krestinskaya
King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
Mohammed E. Fouda
Unknown affiliation
Ahmed Eltawil
Professor of ECE, Associate Dean of Research, King Abdullah University of Science and Technology
Wireless Communications, Next Generation Networks, Body Area Networks, Internet of Bodies
Khaled N. Salama
King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia