Generative Design for Direct-to-Chip Liquid Cooling for Data Centers

📅 2026-04-12

🤖 AI Summary
This work targets thermal hotspots in heterogeneous chip packages, which arise from non-uniform power distribution and are poorly handled by conventional cold plate channel layouts based on heuristic guidelines. The authors propose a generative cooling channel design framework that couples a lightweight physics-based finite-difference thermal model with a constrained reaction-diffusion process. The framework synthesizes non-uniform channel geometries for the NVIDIA GB200 Grace Blackwell Superchip while enforcing inlet/outlet and component constraints, and it redistributes cooling capacity toward high-power regions by iterating channel generation and thermal evaluation in a closed loop. Compared with a baseline parallel-channel design, the reported results show a reduction of more than 5 °C in average temperature and more than 35 °C in maximum temperature.
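To make the idea of a lightweight finite-difference thermal model concrete, here is a minimal sketch. It is not the authors' model: the governing equation (2D conduction plus a per-cell convective sink, `k*lap(T) + q - h*(T - t_coolant) = 0`), the adiabatic edges, the Gauss-Seidel solver, and every name and parameter value (`solve_steady_temperature`, `power`, `h_map`, `k`, `dx`, `t_coolant`) are assumptions chosen for illustration. Cells covered by a channel would simply carry a larger `h_map` value.

```python
def solve_steady_temperature(power, h_map, k=1.0, dx=1.0, t_coolant=30.0,
                             iters=2000, tol=1e-6):
    """Gauss-Seidel solve of k*lap(T) + q - h*(T - t_coolant) = 0 on a grid.

    power  : 2D list, heat source per cell (hypothetical units)
    h_map  : 2D list, heat-transfer coefficient to the coolant per cell;
             must be positive somewhere or the problem has no steady state
    Edges are treated as adiabatic: only in-grid neighbors exchange heat.
    """
    rows, cols = len(power), len(power[0])
    g = k / (dx * dx)  # conductance between neighboring cells
    temp = [[t_coolant] * cols for _ in range(rows)]
    for _ in range(iters):
        max_change = 0.0
        for i in range(rows):
            for j in range(cols):
                nb_sum, nb_count = 0.0, 0
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols:
                        nb_sum += temp[ni][nj]
                        nb_count += 1
                h = h_map[i][j]
                # Energy balance of the cell, solved for its own temperature.
                new = (g * nb_sum + power[i][j] + h * t_coolant) / (g * nb_count + h)
                max_change = max(max_change, abs(new - temp[i][j]))
                temp[i][j] = new
        if max_change < tol:  # stop once a full sweep barely changes anything
            break
    return temp
```

Under these assumptions, uniform power `q` and uniform coefficient `h` give a flat field at `t_coolant + q / h`, which is a convenient sanity check. Raising `h_map` under a hot spot lowers the peak temperature, which is the kind of spatial feedback a channel generator can act on.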

📝 Abstract
Rapid growth in artificial intelligence (AI) workloads is driving up data center power densities, increasing the need for advanced thermal management. Direct-to-chip liquid cooling can remove heat efficiently at the source, but many cold plate channel layouts remain heuristic and are not optimized for the strongly non-uniform temperature distribution of modern heterogeneous packages. This work presents a generative design framework for synthesizing cooling channel geometries for the NVIDIA GB200 Grace Blackwell Superchip. A physics-based finite-difference thermal model provides rapid steady-state temperature predictions and supplies spatial thermal feedback to a constrained reaction-diffusion process that generates novel channel topologies while enforcing inlet/outlet and component constraints. By iterating channel generation and thermal evaluation in a closed loop, the method naturally redistributes cooling capacity toward high-power regions and suppresses hot-spot formation. Compared with a baseline parallel channel design, the resulting channels achieve a reduction of more than 5 °C in average temperature and more than 35 °C in maximum temperature. Overall, the results demonstrate that coupling generative algorithms with lightweight physics-based modeling can significantly enhance direct-to-chip liquid cooling performance, supporting more sustainable scaling of AI computing.
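The closed loop described in the abstract (generate channels, evaluate temperature, feed the temperature field back into generation) can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the abstract does not name the reaction-diffusion model, so the sketch swaps in the well-known Gray-Scott system; the feedback rule (lowering the kill rate where the plate is hot), the crude thermal relaxation, the threshold of 0.2, and all names and parameter values (`generate_channels`, `relax_temperature`, `gain`, `keep_out`) are hypothetical.

```python
def neighbor_sum(f, i, j):
    """Sum of the four neighbors; out-of-range indices are clamped (zero-flux edges)."""
    rows, cols = len(f), len(f[0])
    return sum(
        f[min(max(i + di, 0), rows - 1)][min(max(j + dj, 0), cols - 1)]
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
    )


def relax_temperature(power, channel, t_coolant=30.0, h_solid=0.1,
                      h_channel=2.0, sweeps=100):
    """Crude steady-state estimate of lap(T) + q - h*(T - t_coolant) = 0."""
    rows, cols = len(power), len(power[0])
    temp = [[t_coolant] * cols for _ in range(rows)]
    for _ in range(sweeps):
        for i in range(rows):
            for j in range(cols):
                h = h_channel if channel[i][j] else h_solid  # channels cool better
                temp[i][j] = (neighbor_sum(temp, i, j) + power[i][j]
                              + h * t_coolant) / (4.0 + h)
    return temp


def clamp01(x):
    return min(1.0, max(0.0, x))


def generate_channels(power, inlet, outlet, keep_out, outer_iters=3, rd_steps=100,
                      du=0.16, dv=0.08, feed=0.035, kill=0.062, gain=0.01):
    """Alternate thermal evaluation and Gray-Scott steps; return (channel mask, v)."""
    rows, cols = len(power), len(power[0])
    u = [[1.0] * cols for _ in range(rows)]  # substrate concentration
    v = [[0.0] * cols for _ in range(rows)]  # channel-forming concentration
    channel = [[False] * cols for _ in range(rows)]
    for _ in range(outer_iters):
        temp = relax_temperature(power, channel)
        t_min, t_max = min(map(min, temp)), max(map(max, temp))
        span = (t_max - t_min) or 1.0
        for _ in range(rd_steps):
            new_u = [row[:] for row in u]
            new_v = [row[:] for row in v]
            for i in range(rows):
                for j in range(cols):
                    # Assumed feedback: hotter cells get a lower kill rate,
                    # so the channel species survives more easily there.
                    k_loc = kill - gain * (temp[i][j] - t_min) / span
                    uvv = u[i][j] * v[i][j] ** 2
                    lap_u = neighbor_sum(u, i, j) - 4.0 * u[i][j]
                    lap_v = neighbor_sum(v, i, j) - 4.0 * v[i][j]
                    new_u[i][j] = clamp01(u[i][j] + du * lap_u - uvv
                                          + feed * (1.0 - u[i][j]))
                    new_v[i][j] = clamp01(v[i][j] + dv * lap_v + uvv
                                          - (feed + k_loc) * v[i][j])
            u, v = new_u, new_v
            # Constraints: component keep-out cells never carry coolant;
            # inlet and outlet cells always do, and act as pattern seeds.
            for (i, j) in keep_out:
                v[i][j] = 0.0
            for (i, j) in (inlet, outlet):
                v[i][j] = 1.0
        channel = [[v[i][j] > 0.2 for j in range(cols)] for i in range(rows)]
    return channel, v
```

A call such as `generate_channels(power, inlet=(0, 0), outlet=(7, 7), keep_out={(3, 3)})` on an 8x8 `power` grid returns a boolean channel mask plus the underlying concentration field. The sketch only guarantees the hard constraints (inlet and outlet marked as channel, keep-out cells excluded); whether the resulting pattern forms a connected, manufacturable flow path depends on parameters and would need additional checks that are omitted here.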
Problem

Research questions and friction points this paper is trying to address.

direct-to-chip liquid cooling
thermal management
non-uniform temperature distribution
heterogeneous packages
cold plate design
Innovation

Methods, ideas, or system contributions that make the work stand out.

generative design
direct-to-chip liquid cooling
reaction-diffusion
thermal management
heterogeneous packaging