🤖 AI Summary
Plasma turbulence significantly impairs energy confinement in fusion devices, and simulating it accurately requires solving the computationally expensive 5D nonlinear gyrokinetic equation. Reduced-order models are cheaper but omit nonlinear effects unique to the full 5D dynamics. This paper introduces GyroSwin, the first scalable 5D neural surrogate for nonlinear gyrokinetic simulations. It extends hierarchical Vision Transformers to five dimensions, adds cross-attention and integration modules for latent 3D↔5D interactions between the electrostatic potential fields and the distribution function, and performs channelwise mode separation inspired by nonlinear physics. Experiments show that GyroSwin outperforms widely used reduced numerical models (not full gyrokinetic solvers) on heat flux prediction, captures the turbulent energy cascade, and reduces the cost of fully resolved nonlinear gyrokinetics by three orders of magnitude while remaining physically verifiable. The model also shows promising scaling behavior, tested up to one billion parameters.
📝 Abstract
Nuclear fusion plays a pivotal role in the quest for reliable and sustainable energy production. A major roadblock to viable fusion power is understanding plasma turbulence, which significantly impairs plasma confinement; this understanding is vital for next-generation reactor design. Plasma turbulence is governed by the nonlinear gyrokinetic equation, which evolves a 5D distribution function over time. Due to its high computational cost, reduced-order models are often employed in practice to approximate turbulent transport of energy. However, they omit nonlinear effects unique to the full 5D dynamics. To tackle this, we introduce GyroSwin, the first scalable 5D neural surrogate that can model 5D nonlinear gyrokinetic simulations, thereby capturing the physical phenomena neglected by reduced models, while providing accurate estimates of turbulent heat transport. GyroSwin (i) extends hierarchical Vision Transformers to 5D, (ii) introduces cross-attention and integration modules for latent 3D$\leftrightarrow$5D interactions between electrostatic potential fields and the distribution function, and (iii) performs channelwise mode separation inspired by nonlinear physics. We demonstrate that GyroSwin outperforms widely used reduced numerics on heat flux prediction, captures the turbulent energy cascade, and reduces the cost of fully resolved nonlinear gyrokinetics by three orders of magnitude while remaining physically verifiable. GyroSwin shows promising scaling laws, tested up to one billion parameters, paving the way for scalable neural surrogates for gyrokinetic simulations of plasma turbulence.
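The abstract does not say how the hierarchical Vision Transformer is extended to 5D. As a rough illustration only, assuming the extension follows the familiar Swin-style pattern of attending within non-overlapping local windows, the sketch below partitions a 5D grid into 5D windows and reassembles it. The function names, the axis order, the channel-last layout, and the window sizes are all hypothetical and are not taken from GyroSwin.

```python
import numpy as np

def window_partition_5d(x, window):
    """Split a channel-last 5D grid into non-overlapping 5D windows.

    x:      array of shape (D1, D2, D3, D4, D5, C)  (assumed layout)
    window: tuple (w1, ..., w5); each w_i must divide D_i
    returns array of shape (num_windows, w1*w2*w3*w4*w5, C)
    """
    dims, c = x.shape[:5], x.shape[5]
    if any(d % w != 0 for d, w in zip(dims, window)):
        raise ValueError("each window size must divide its grid dimension")
    # Split every axis D_i into (D_i // w_i, w_i).
    split = []
    for d, w in zip(dims, window):
        split += [d // w, w]
    x = x.reshape(*split, c)  # axes: (n1, w1, n2, w2, ..., n5, w5, C)
    # Move the block indices first, then the in-window indices, then channels.
    x = x.transpose(0, 2, 4, 6, 8, 1, 3, 5, 7, 9, 10)
    n_windows = int(np.prod([d // w for d, w in zip(dims, window)]))
    return x.reshape(n_windows, int(np.prod(window)), c)

def window_reverse_5d(windows, window, dims):
    """Inverse of window_partition_5d: rebuild the (D1, ..., D5, C) grid."""
    c = windows.shape[-1]
    blocks = [d // w for d, w in zip(dims, window)]
    x = windows.reshape(*blocks, *window, c)  # axes: (n1..n5, w1..w5, C)
    # Interleave block and in-window indices back into (n_i, w_i) pairs.
    x = x.transpose(0, 5, 1, 6, 2, 7, 3, 8, 4, 9, 10)
    return x.reshape(*dims, c)

# Toy example with made-up sizes: a 4x4x2x2x2 grid with 3 channels.
grid = np.arange(4 * 4 * 2 * 2 * 2 * 3, dtype=np.float64).reshape(4, 4, 2, 2, 2, 3)
win = (2, 2, 1, 2, 2)
tokens = window_partition_5d(grid, win)
restored = window_reverse_5d(tokens, win, grid.shape[:5])
```

Each row of `tokens` is one window whose entries could be fed to a standard self-attention layer, so attention cost grows with the window volume rather than with the full 5D grid volume. That is the usual motivation for windowed attention, stated here as a general property and not as a claim about the paper's design.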