Cooling Channel Design Optimization for High Power Multi-chip Packages

📅 2026-05-19

🤖 AI Summary
This work addresses the thermal challenges of high-power heterogeneous multi-chip packages, such as NVIDIA’s GB200, by proposing a parameterizable interdigitated microchannel cooling architecture. A thermofluidic coupling framework is built from a porous medium model combined with a row-level coolant energy balance. The design adds a cooling coverage constraint that targets the high heat-flux GPU regions, and it uses a surrogate model with a mixed-integer quadratic programming (MIQP) algorithm to optimize the channel geometric parameters. Evaluated on a GB200-based multi-chip configuration, the approach reduces peak chip temperature by 140.45 °C and average chip temperature by 35.87 °C compared to the baseline design.
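The row-level coolant energy balance mentioned above can be illustrated with a minimal sketch. It assumes a single coolant stream that passes the rows in sequence and absorbs all of each row's heat, so the temperature rise per row is q / (m_dot * c_p). The function name, parameters, and numbers are hypothetical and are not taken from the paper, whose exact formulation is not given here.

```python
def coolant_row_temperatures(row_heat_w, mass_flow_kg_s, cp_j_per_kg_k, inlet_temp_c):
    """Return the coolant temperature (deg C) leaving each row, marching downstream.

    Assumption (not from the paper): steady state, one coolant stream, and all
    heat released in a row is absorbed by the coolant in that row.
    """
    if mass_flow_kg_s <= 0 or cp_j_per_kg_k <= 0:
        raise ValueError("mass flow and specific heat must be positive")

    heat_capacity_rate = mass_flow_kg_s * cp_j_per_kg_k  # W/K
    temps = []
    current = inlet_temp_c
    for q in row_heat_w:
        # Energy balance for one row: q = m_dot * c_p * (T_out - T_in)
        current += q / heat_capacity_rate
        temps.append(current)
    return temps


# Illustrative values only: two rows dissipating 40 W and 80 W.
outlet_temps = coolant_row_temperatures([40.0, 80.0], 0.01, 4000.0, 20.0)
```

Rows further downstream see warmer coolant, which is one reason channel layout matters for the chips placed last along the flow path.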
📝 Abstract
Thermal management is a major challenge in next-generation high-performance computing systems, particularly for heterogeneous multi-chip packages such as the NVIDIA GB200 Grace Blackwell Superchip. In this work, a physics-based computational framework is developed to optimize embedded cooling channel layouts for high-power multi-chip modules. The model couples steady-state heat conduction with a porous-media-based representation of coolant transport and a row-wise coolant energy balance to estimate chip temperature fields within microchannel networks. Unlike conventional designs, an interdigitated cooling architecture is parameterized using geometric variables, including channel count, width, and expansion over chip regions, enabling systematic design exploration. To enable efficient optimization, a surrogate-based approach is employed to approximate the relationship between geometric parameters and temperature metrics. The resulting model is optimized using a mixed-integer quadratic programming algorithm to minimize a weighted objective based on peak and average chip temperatures. To improve physical relevance, channel placement is further constrained to increase cooling coverage near GPU regions, where thermal loads are highest. The framework is applied to a representative multi-chip configuration based on the NVIDIA GB200 architecture, consisting of two graphics processing units and one central processing unit. The results demonstrate that the optimal design reduces the peak chip temperature by 140.45 °C and the average chip temperature by 35.87 °C compared to the baseline configuration.
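As a rough illustration of the surrogate-based step, the sketch below minimizes a weighted sum of two quadratic surrogates (peak and average temperature) over an integer channel count and a discrete set of channel widths. It uses plain enumeration rather than an MIQP solver, and every coefficient, bound, and name is a hypothetical placeholder rather than a value from the paper. The `max_total_width` limit is likewise only a stand-in for a geometric constraint, not the paper's coverage constraint.

```python
from itertools import product


def quad_surrogate(coeffs, n, w):
    """Evaluate c0 + c1*n + c2*w + c3*n^2 + c4*w^2 + c5*n*w."""
    c0, c1, c2, c3, c4, c5 = coeffs
    return c0 + c1 * n + c2 * w + c3 * n * n + c4 * w * w + c5 * n * w


def optimize_channels(peak_coeffs, avg_coeffs, counts, widths,
                      weight=0.5, max_total_width=None):
    """Return (objective, channel_count, channel_width) minimizing
    weight * T_peak + (1 - weight) * T_avg over the candidate grid.

    Brute-force enumeration stands in for the MIQP solver here.
    """
    best = None
    for n, w in product(counts, widths):
        # Hypothetical geometric limit: channels must fit the available span.
        if max_total_width is not None and n * w > max_total_width:
            continue
        objective = (weight * quad_surrogate(peak_coeffs, n, w)
                     + (1 - weight) * quad_surrogate(avg_coeffs, n, w))
        if best is None or objective < best[0]:
            best = (objective, n, w)
    return best


# Placeholder surrogates: (n - 6)^2 + (w - 2)^2 plus an offset of 50 or 40.
PEAK = (90, -12, -4, 1, 1, 0)
AVG = (80, -12, -4, 1, 1, 0)
result = optimize_channels(PEAK, AVG, range(1, 11), [1, 2, 3])
```

Enumeration is workable only for small design grids; a dedicated mixed-integer solver becomes necessary as the number of geometric variables grows.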
Problem

Research questions and friction points this paper is trying to address.

thermal management
cooling channel design
multi-chip packages
high-power computing
chip temperature
Innovation

Methods, ideas, or system contributions that make the work stand out.

cooling channel optimization
multi-chip package thermal management
interdigitated microchannel
surrogate-based optimization
porous media modeling