MCMComm: Hardware-Software Co-Optimization for End-to-End Communication in Multi-Chip-Modules

📅 2025-04-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high inter-chip communication overhead, congestion, and packaging heterogeneity that limit the energy efficiency of multi-chip-module (MCM) accelerators, this paper proposes an end-to-end, congestion-aware, packaging-adaptive framework for analyzing and optimizing communication. It combines diagonal interconnect links, on-chip redistribution, and non-uniform workload partitioning into a hardware-software co-optimization approach, and solves the resulting optimization problem with a genetic algorithm (GA) and with mixed-integer quadratic programming (MIQP). Evaluated on CNN and Vision Transformer models, the framework achieves up to 1.58× and 2.7× improvement in energy-delay product (EDP) using GA and MIQP, respectively. Key contributions are: (i) an end-to-end, off-chip congestion-aware and packaging-adaptive analytical framework; (ii) a hardware-software co-optimization methodology based on diagonal links, on-chip redistribution, and non-uniform partitioning; and (iii) GA-based and MIQP-based solvers for the resulting optimization problem.

📝 Abstract
Increasing AI computing demands and slowing transistor scaling have led to the advent of Multi-Chip-Module (MCM) based accelerators. MCMs enable cost-effective scalability, higher yield, and modular reuse by partitioning large chips into smaller chiplets. However, MCMs come at an increased communication cost, which requires critical analysis and optimization. This paper makes three main contributions: (i) an end-to-end, off-chip congestion-aware and packaging-adaptive analytical framework for detailed analysis, (ii) hardware-software co-optimization incorporating diagonal links, on-chip redistribution, and non-uniform workload partitioning to optimize communication within the framework, and (iii) the use of metaheuristics (genetic algorithms, GA) and mixed-integer quadratic programming (MIQP) to solve the resulting optimization problem. Experimental results demonstrate significant performance improvements for CNNs and Vision Transformers, showcasing up to 1.58x and 2.7x EDP (energy-delay product) improvement using GA and MIQP, respectively.
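The energy-delay product quoted above is simply energy multiplied by delay, and an "N x improvement" is the baseline EDP divided by the optimized EDP. The following minimal sketch shows that arithmetic; the energy and delay values are hypothetical placeholders chosen for illustration, not measurements from the paper.

```python
def edp(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay product (joule-seconds); lower is better."""
    return energy_joules * delay_seconds


def edp_improvement(baseline_edp: float, optimized_edp: float) -> float:
    """Improvement factor: values above 1.0 mean the optimized design is better."""
    return baseline_edp / optimized_edp


# Hypothetical numbers (assumptions, not results reported by the authors).
baseline = edp(energy_joules=2.0, delay_seconds=0.010)
optimized = edp(energy_joules=1.6, delay_seconds=0.008)

ratio = edp_improvement(baseline, optimized)
print(f"EDP improvement: {ratio:.2f}x")
```

Because EDP multiplies the two quantities, a 20% reduction in both energy and delay compounds to roughly a 1.56x improvement in this toy case.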
Problem

Research questions and friction points this paper is trying to address.

Optimizing communication costs in Multi-Chip-Module (MCM) accelerators
Developing a congestion-aware analytical framework for MCM communication
Enhancing performance via hardware-software co-optimization techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Off-chip congestion-aware packaging-adaptive analytical framework
Hardware-software co-optimization with diagonal links
Metaheuristics and MIQP for framework optimization
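To give a feel for how a metaheuristic can search for a non-uniform workload partition, here is a toy genetic algorithm in Python. It is a sketch under stated assumptions, not the paper's cost model or solver: the hop counts, the cost constants, and the helper names (`latency`, `normalize`, `evolve`) are all hypothetical. The only idea it illustrates is that chiplets farther from memory may benefit from a smaller share of the work.

```python
import random

# Hypothetical toy setup: four chiplets, each a fixed number of hops from memory.
HOPS = [1, 2, 2, 3]
COMPUTE_COST = 1.0       # assumed time per unit of work
COMM_COST_PER_HOP = 0.5  # assumed time per unit of work, per hop


def latency(shares):
    """Completion time is set by the slowest chiplet (toy model)."""
    return max(s * (COMPUTE_COST + COMM_COST_PER_HOP * h)
               for s, h in zip(shares, HOPS))


def normalize(weights):
    """Scale weights so the workload shares sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def evolve(generations=200, pop_size=30, seed=0):
    rng = random.Random(seed)
    uniform = normalize([1.0] * len(HOPS))
    # Include the uniform split so the result can never be worse than it.
    pop = [uniform] + [normalize([rng.random() + 1e-6 for _ in HOPS])
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=latency)
        parents = pop[: pop_size // 2]  # elitist selection: keep the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [(x + y) / 2 for x, y in zip(a, b)]  # arithmetic crossover
            i = rng.randrange(len(child))
            child[i] = max(1e-6, child[i] + rng.gauss(0, 0.02))  # small mutation
            children.append(normalize(child))
        pop = parents + children
    return min(pop, key=latency)


uniform = normalize([1.0] * len(HOPS))
best = evolve()
print("uniform latency:", latency(uniform))
print("evolved latency:", latency(best))
print("evolved shares: ", [round(s, 3) for s in best])
```

In this toy model the ideal split gives each chiplet a share inversely proportional to its per-unit cost, so the search should drift toward assigning less work to the chiplet three hops away. An exact solver such as MIQP would target the same kind of objective with optimality guarantees on a suitably formulated model.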