🤖 AI Summary
To address the slow convergence and poor stability of hyperparameter tuning during reinforcement learning (RL) training, this paper introduces HyperController, the first real-time controller that models dynamic hyperparameter adaptation as an unknown linear Gaussian dynamical system. Using the Kalman filter, which is the optimal one-step predictor for such a system, HyperController estimates and updates hyperparameters online, without requiring gradient computations or policy retraining. The method integrates into mainstream RL tooling (e.g., Gymnasium environments) and supports end-to-end training. Evaluated on five standard benchmark environments, HyperController achieves the highest median evaluation reward in four of them, reduces average training time by 37%, and decreases convergence variance by 52%. These results indicate substantial improvements in both the stability of policy performance and deployment efficiency.
📝 Abstract
We introduce Hyperparameter Controller (HyperController), a computationally efficient algorithm for hyperparameter optimization during the training of reinforcement learning neural networks. HyperController optimizes hyperparameters quickly while the reinforcement learning neural network continues to improve, resulting in faster training and deployment. It achieves this by modeling the hyperparameter optimization problem as an unknown Linear Gaussian Dynamical System, that is, a system whose state evolves linearly over time and is perturbed by Gaussian noise. It then learns an efficient representation of the hyperparameter objective function using the Kalman filter, which is the optimal one-step predictor for a Linear Gaussian Dynamical System. To demonstrate its performance, we apply HyperController as a hyperparameter optimizer during the training of reinforcement learning neural networks on a variety of OpenAI Gymnasium environments. In four of the five Gymnasium environments, HyperController achieves the highest median reward during evaluation compared to other algorithms. The results show the potential of HyperController for efficient and stable training of reinforcement learning neural networks.
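To make the "optimal one-step predictor" idea concrete, the following is a minimal sketch of a Kalman filter for a scalar linear Gaussian dynamical system, x[t+1] = a·x[t] + w[t] and y[t] = c·x[t] + v[t], with Gaussian noise terms w and v. It is not the HyperController algorithm: the paper treats the system as unknown, whereas this sketch assumes the parameters `a`, `c`, `q`, and `r` are known. The names `ScalarLGDS`, `kalman_step`, and `compare_predictors` are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class ScalarLGDS:
    """Hypothetical scalar linear Gaussian dynamical system.

    Assumption: all four parameters are known, unlike in the paper's setting.
    """
    a: float  # state transition: x[t+1] = a * x[t] + w[t]
    c: float  # observation:      y[t]   = c * x[t] + v[t]
    q: float  # variance of the process noise w[t]
    r: float  # variance of the observation noise v[t]


def kalman_step(model, x_pred, p_pred, y):
    """Run one update-then-predict cycle of the Kalman filter.

    x_pred, p_pred: predicted mean and variance of x[t] before seeing y[t].
    Returns the predicted mean and variance of x[t+1].
    """
    innovation = y - model.c * x_pred            # observation surprise
    s = model.c * p_pred * model.c + model.r     # innovation variance
    gain = p_pred * model.c / s                  # Kalman gain
    x_filt = x_pred + gain * innovation          # corrected estimate of x[t]
    p_filt = (1.0 - gain * model.c) * p_pred     # corrected variance
    x_next = model.a * x_filt                    # one-step-ahead prediction
    p_next = model.a * p_filt * model.a + model.q
    return x_next, p_next


def compare_predictors(model, steps, seed=0):
    """Simulate the system; return (Kalman MSE, naive MSE) for predicting x[t+1].

    The naive baseline predicts x[t+1] from the latest observation alone.
    """
    rng = random.Random(seed)
    x = 0.0
    x_pred, p_pred = 0.0, 1.0  # initial guess for the state and its variance
    kalman_se = naive_se = 0.0
    for _ in range(steps):
        y = model.c * x + rng.gauss(0.0, model.r ** 0.5)
        x_pred, p_pred = kalman_step(model, x_pred, p_pred, y)
        naive_pred = model.a * y / model.c
        x = model.a * x + rng.gauss(0.0, model.q ** 0.5)  # true x[t+1]
        kalman_se += (x - x_pred) ** 2
        naive_se += (x - naive_pred) ** 2
    return kalman_se / steps, naive_se / steps


if __name__ == "__main__":
    demo = ScalarLGDS(a=0.95, c=1.0, q=0.1, r=1.0)
    kalman_mse, naive_mse = compare_predictors(demo, steps=5000)
    print(f"Kalman one-step MSE: {kalman_mse:.3f}, naive MSE: {naive_mse:.3f}")
```

In this sketch the predicted variance `p_pred` evolves independently of the observed data and settles to a fixed value, which is why a steady-state Kalman predictor is cheap to run online. How HyperController learns its representation when the system parameters are unknown is described in the paper itself and is not reproduced here.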