🤖 AI Summary
Addressing the challenges of instability control in complex dynamical systems—particularly chaotic ones—high-fidelity simulation costs, and low modeling accuracy under sparse data, this paper proposes a multi-fidelity reinforcement learning (RL) framework. Methodologically, it introduces a physics-informed, differentiable hybrid model that jointly integrates low-fidelity simulations with sparse high-fidelity observations; designs a spectral-analysis-based reward function to explicitly guide control policies toward target statistical properties; and enables end-to-end differentiable co-optimization. The key contribution is the first integration of spectral-aware rewards and differentiable multi-fidelity modeling into deep RL for control. Evaluated on two canonical chaotic systems, the approach achieves statistical control performance equivalent to that of full high-fidelity environments—using less than 10% of the high-fidelity query cost—and significantly outperforms existing state-of-the-art methods.
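The summary's "physics-informed hybrid model corrected by sparse high-fidelity data" can be illustrated with a minimal sketch. The dynamics, the correction form (a linear residual term fit by least squares), and all function names here are illustrative assumptions, not the paper's actual model:

```python
import numpy as np

def low_fidelity_step(x, dt=0.01):
    """Cheap physics-based model: a coarse (here, linear) approximation
    of the true dynamics. Purely illustrative."""
    return x + dt * (-x)

def fit_correction(X_hf, Y_hf, dt=0.01):
    """Fit a linear correction from sparse high-fidelity pairs
    (x_t, x_{t+1}) by least squares on the model residual."""
    residual = Y_hf - low_fidelity_step(X_hf, dt)
    # residual ≈ a * x  =>  a = <x, r> / <x, x>
    a = float(X_hf @ residual / (X_hf @ X_hf))
    return a

def hybrid_step(x, a, dt=0.01):
    """Hybrid model: low-fidelity prediction plus learned correction."""
    return low_fidelity_step(x, dt) + a * x
```

In the paper the hybrid model is differentiable end to end, so the correction and the control policy can be co-optimized by gradient descent rather than fit in a separate least-squares stage as above.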
📝 Abstract
Controlling instabilities in complex dynamical systems is a central challenge in scientific and engineering applications. Deep reinforcement learning (DRL) has shown promising results across a range of scientific domains. However, the many-query nature of control tasks requires repeated interaction with real environments governed by the underlying physics, and such data are usually sparse when collected from experiments or expensive to obtain from simulations of complex dynamics. Alternatively, controlling a surrogate model can mitigate the computational cost, but fast learning-based models trained offline struggle to reproduce accurate pointwise dynamics when the system is chaotic. To bridge this gap, the current work proposes a multi-fidelity reinforcement learning (MFRL) framework that leverages differentiable hybrid models for control tasks, in which a physics-based hybrid model is corrected using limited high-fidelity data. We also propose a spectrum-based reward function for RL training. The effectiveness of the proposed framework is demonstrated on two complex dynamical systems in physics. The statistics of the MFRL control results match those computed from many-query evaluations of the high-fidelity environments and outperform other state-of-the-art (SOTA) baselines.
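A spectrum-based reward of the kind described above can be sketched as follows: score a controlled trajectory by how closely its power spectrum matches a target spectrum. The FFT-based spectrum estimate and the negative-L2 reward form are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def power_spectrum(traj):
    """One-sided power spectrum of a 1-D state trajectory."""
    fft = np.fft.rfft(traj - np.mean(traj))
    return np.abs(fft) ** 2 / len(traj)

def spectral_reward(traj, target_spectrum):
    """Negative L2 distance between the trajectory's spectrum and the
    target spectrum; closer to zero means better statistical match."""
    spec = power_spectrum(traj)
    return -float(np.linalg.norm(spec - target_spectrum))
```

Rewarding spectral agreement rather than pointwise tracking is what makes the objective usable for chaotic systems, where pointwise trajectories diverge exponentially but statistical properties remain controllable.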