🤖 AI Summary
Problem: Thermal modeling for 2.5D/3D heterogeneous multi-die architectures faces a fundamental trade-off between accuracy and computational efficiency. Method: This paper proposes MFIT, the first hierarchical, multi-fidelity thermal modeling framework tailored for chiplet-based systems. It combines reduced-order modeling (ROM), graph neural network-driven spatial feature extraction, physics-constrained fidelity-aware interpolation, and an adaptive switching mechanism to support cross-scale (architecture-to-package) and cross-phase modeling. Results: Evaluated on 2.5D systems with 16, 36, and 64 chiplets and a 3D system with 16×3 chiplets, the framework reduces thermal simulation time from days to seconds or milliseconds, a speedup of more than 10⁵×, while maintaining a mean absolute error of 1.2% (below 1.5%). This enables efficient design-space exploration and runtime thermal management in complex heterogeneous integration scenarios.
📝 Abstract
Rapidly evolving artificial intelligence and machine learning applications require ever-increasing computational capabilities, while monolithic 2D design technologies approach their limits. Heterogeneous integration of smaller chiplets using a 2.5D silicon interposer and 3D packaging has emerged as a promising paradigm to overcome these limits and meet performance demands. These approaches offer significantly lower cost and higher manufacturing yield than monolithic 2D integrated circuits. However, the compact arrangement and high compute density exacerbate thermal management challenges, potentially compromising performance. Addressing the resulting thermal modeling challenges is critical, especially as system sizes grow and different design stages require different levels of accuracy and speed. Since no single thermal modeling technique meets all these needs, this paper introduces MFIT, a range of multi-fidelity thermal models that balance accuracy and speed. These models enable efficient design space exploration and runtime thermal management. Our extensive testing on systems with 16, 36, and 64 2.5D integrated chiplets and 16×3 3D integrated chiplets demonstrates that these models can reduce execution times from days to mere seconds or milliseconds with negligible loss in accuracy.
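To make the low-fidelity end of such a model range concrete, the sketch below solves a steady-state lumped thermal-resistance network for a grid of chiplets. It is a generic textbook technique, not MFIT's actual model; the function name `steady_state_temps`, the conductance values, and the ambient temperature are all assumptions chosen for illustration.

```python
import numpy as np


def steady_state_temps(power_w, g_lateral=0.5, g_vertical=0.2, t_ambient=45.0):
    """Steady-state temperatures (deg C) for a grid of chiplet nodes.

    Hypothetical lumped model: each chiplet is one node, coupled to its
    4-neighbours by g_lateral (W/K) and to ambient by g_vertical (W/K).
    Solves G * dT = P, where dT is the temperature rise above ambient.
    """
    power = np.asarray(power_w, dtype=float)
    rows, cols = power.shape
    n = rows * cols
    G = np.zeros((n, n))

    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            # Vertical path from this node to ambient.
            G[i, i] += g_vertical
            # Lateral coupling to the right and lower neighbours;
            # each edge is stamped once, symmetrically.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    j = rr * cols + cc
                    G[i, i] += g_lateral
                    G[j, j] += g_lateral
                    G[i, j] -= g_lateral
                    G[j, i] -= g_lateral

    # G is symmetric positive definite as long as g_vertical > 0.
    d_t = np.linalg.solve(G, power.ravel())
    return t_ambient + d_t.reshape(rows, cols)
```

Calling `steady_state_temps(np.full((4, 4), 2.0))` models a 16-chiplet system with a uniform 2 W per chiplet; raising one entry of the power map produces a hotspot at that chiplet. A model this coarse solves in well under a millisecond for these sizes, which is the kind of speed a runtime controller needs, whereas a detailed finite-element simulation of the same package would trade that speed for accuracy.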