🤖 AI Summary
Manual parameter tuning in high-power femtosecond laser systems is inefficient, while conventional optimization methods rely on static assumptions and lack adaptability to dynamic experimental conditions. **Method:** This paper proposes an end-to-end real-time adaptive control framework based on deep reinforcement learning, specifically a lightweight vision-policy network trained with the Proximal Policy Optimization (PPO) algorithm, to learn pulse-shaping control policies directly from raw image observations. An online adaptive control mechanism enables generalization across diverse dynamical regimes. **Contribution/Results:** Experiments across multiple configurations of real-world laser systems show the learned policy reaching 90% of the target intensity in test environments. The framework significantly enhances robustness against device drift, environmental disturbances, and system variations, overcoming a fundamental limitation of black-box optimization methods: their inability to adapt dynamically to changing conditions in high-energy physics experiments.
📝 Abstract
High Power Laser (HPL) systems operate in the femtosecond regime, one of the shortest timescales achievable in experimental physics. HPL systems are instrumental in high-energy physics, leveraging ultra-short pulse durations to yield extremely high intensities, which are essential for both practical applications and theoretical advancements in light-matter interactions. Traditionally, the parameters regulating HPL optical performance are tuned manually by human experts or optimized with black-box methods that can be computationally demanding. Critically, black-box methods rely on stationarity assumptions, overlooking the complex dynamics of high-energy physics and the day-to-day changes of real-world experimental settings, and thus often need to be restarted. Deep Reinforcement Learning (DRL) offers a promising alternative by enabling sequential decision-making in non-static settings. This work investigates the safe application of DRL to HPL systems, and extends the current research by (1) learning a control policy directly from images and (2) addressing the need for generalization across diverse dynamics. We evaluate our method across various configurations and observe that DRL effectively enables cross-domain adaptability, coping with fluctuating dynamics while achieving 90% of the target intensity in test environments.
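The paper itself does not include code. As a minimal illustration of the PPO machinery the framework builds on, the sketch below computes the clipped surrogate objective that PPO maximizes when updating a policy (here the policy would be the vision network mapping pulse images to shaping actions). All names and values are our own assumptions, not the authors' implementation.

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective (per-sample, to be averaged then maximized).

    ratio:     pi_new(a|s) / pi_old(a|s), the probability ratio between the
               updated policy and the policy that collected the data.
    advantage: estimated advantage A(s, a) for each sample.
    eps:       clipping range; 0.2 is the commonly used default.
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Taking the minimum removes the incentive to move the policy
    # far outside the [1 - eps, 1 + eps] trust region in one update.
    return np.minimum(unclipped, clipped)

# Toy check: a ratio of 1.5 with positive advantage is clipped to 1.2,
# while a ratio of 0.5 with negative advantage is clipped to 0.8.
vals = ppo_clip_objective(np.array([1.5, 0.5]), np.array([1.0, -1.0]))
print(vals)  # → [ 1.2 -0.8]
```

In the paper's setting, `advantage` would be derived from a reward measuring how close the shaped pulse's intensity profile is to the target, and the clipping is what keeps each policy update conservative enough for safe operation on real laser hardware.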