Offline Reinforcement Learning for Rotation Profile Control in Tokamaks

📅 2026-05-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the control of tokamak plasma rotation profiles, a problem made difficult by high dimensionality, strong coupling among multiple actuators, and pronounced dependence on plasma state. The study investigates offline reinforcement learning and offline model-based reinforcement learning trained exclusively on historical data from the DIII-D tokamak, which avoids the need for an accurate simulator of rotation profile dynamics. The final method uses probabilistic models of plasma dynamics to generate synthetic rollouts for policy optimization, yielding a multi-input multi-output control policy learned from archival operational data. Deployment on the DIII-D tokamak produced promising results, suggesting that complex physical systems can be controlled using only limited historical data.

📝 Abstract

Tokamaks remain leading candidates for achieving practical fusion energy, yet many important control problems inside these devices are still difficult or unsolved. One such challenge is controlling the plasma rotation profile, which strongly influences stability, confinement, and transport. While the average rotation can be controlled, controlling the full profile is challenging because of its high dimensionality, its response to multiple actuators, and its dependence on plasma conditions. Learning-based control methods, such as reinforcement learning (RL), offer a potential solution because they can model complex interactions and thereby yield effective multi-input multi-output control. However, learning such policies is difficult because accurate simulators of the rotation profile dynamics are lacking. In this work, we investigate the use of offline RL and offline model-based RL algorithms for rotation profile control, training them solely on historical data from the DIII-D tokamak. Our final method uses probabilistic models of plasma dynamics to generate rollouts for RL training. We deploy this policy on the DIII-D tokamak and observe promising real-world results. We conclude by highlighting key challenges and insights from training and deploying an RL policy on a complex physical device while using only limited past data.
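
The idea of generating rollouts from a probabilistic dynamics model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy linear-Gaussian model stands in for a model that would be learned from historical shots, and the names `make_toy_model`, `rollout`, the linear feedback policy, and the squared-error reward are all hypothetical. The sketch only shows the general pattern: the model returns a mean and standard deviation for the next state, the rollout samples from that distribution, and the resulting synthetic trajectory is what an RL algorithm would train on.

```python
import numpy as np


def make_toy_model(state_dim, action_dim, rng):
    """Hypothetical stand-in for a learned probabilistic dynamics model.

    Returns a function mapping (state, action) to the mean and standard
    deviation of a Gaussian over the next state. A real model would be
    fit to historical data; here the parameters are arbitrary.
    """
    A = 0.9 * np.eye(state_dim)                           # assumed state decay
    B = 0.1 * rng.standard_normal((state_dim, action_dim))  # assumed actuator coupling
    sigma = 0.05 * np.ones(state_dim)                     # assumed predictive noise

    def predict(state, action):
        mean = A @ state + B @ action
        return mean, sigma

    return predict


def rollout(predict, policy, reward_fn, start_state, horizon, rng):
    """Generate one synthetic trajectory by sampling from the model."""
    states, actions, rewards = [start_state], [], []
    state = start_state
    for _ in range(horizon):
        action = policy(state)
        mean, std = predict(state, action)
        # Sample the next state rather than using the mean, so the
        # trajectory reflects the model's predictive uncertainty.
        state = mean + std * rng.standard_normal(mean.shape)
        states.append(state)
        actions.append(action)
        rewards.append(reward_fn(state))
    return {
        "states": np.stack(states),    # shape (horizon + 1, state_dim)
        "actions": np.stack(actions),  # shape (horizon, action_dim)
        "rewards": np.array(rewards),  # shape (horizon,)
    }


# Usage: a profile discretized at 4 points, driven by 2 actuators (both
# numbers are arbitrary choices for this sketch).
rng = np.random.default_rng(0)
predict = make_toy_model(state_dim=4, action_dim=2, rng=rng)
target = np.array([1.0, 0.8, 0.5, 0.2])       # hypothetical target profile
gain = 0.5 * rng.standard_normal((2, 4))      # hypothetical policy weights

trajectory = rollout(
    predict,
    policy=lambda s: np.clip(gain @ (target - s), -1.0, 1.0),
    reward_fn=lambda s: -float(np.sum((s - target) ** 2)),
    start_state=np.zeros(4),
    horizon=20,
    rng=rng,
)
```

In a model-based offline RL setup, many such trajectories would typically be started from states drawn from the historical dataset and kept short, since errors in a learned model compound over long horizons.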

Problem

Research questions and friction points this paper is trying to address.

rotation profile control

tokamak

offline reinforcement learning

plasma control

multi-input multi-output control

Innovation

Methods, ideas, or system contributions that make the work stand out.

offline reinforcement learning

rotation profile control

probabilistic dynamics model

tokamak

model-based RL
