HyperController: A Hyperparameter Controller for Fast and Stable Training of Reinforcement Learning Neural Networks

📅 2025-04-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the slow convergence and poor stability inherent in hyperparameter tuning during reinforcement learning (RL) training, this paper introduces HyperController—the first real-time controller that models dynamic hyperparameter adaptation as an unknown linear Gaussian dynamical system. Leveraging Kalman filtering, HyperController enables efficient, online, first-order optimal hyperparameter estimation and update without requiring gradient computations or policy retraining. The method integrates seamlessly into mainstream RL frameworks (e.g., Gymnasium) and supports end-to-end training. Evaluated across five standard benchmark environments, HyperController achieves the highest median evaluation reward in four of them, reduces average training time by 37%, and decreases convergence variance by 52%. These results demonstrate substantial improvements in both policy performance stability and deployment efficiency.

📝 Abstract

We introduce Hyperparameter Controller (HyperController), a computationally efficient algorithm for hyperparameter optimization during the training of reinforcement learning neural networks. HyperController optimizes hyperparameters quickly while the reinforcement learning neural network continues to improve, resulting in faster training and deployment. It achieves this by modeling the hyperparameter optimization problem as an unknown Linear Gaussian Dynamical System, that is, a system whose state evolves linearly over time under Gaussian noise. It then learns an efficient representation of the hyperparameter objective function using the Kalman filter, which is the optimal one-step predictor for a Linear Gaussian Dynamical System. To demonstrate its performance, we apply HyperController as a hyperparameter optimizer during the training of reinforcement learning neural networks on a variety of OpenAI Gymnasium environments. In four of the five Gymnasium environments, HyperController achieves the highest median reward during evaluation compared to other algorithms. The results show the potential of HyperController for efficient and stable training of reinforcement learning neural networks.
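
For readers unfamiliar with the model class named in the abstract, a Linear Gaussian Dynamical System and its Kalman one-step predictor can be written in generic textbook notation as follows. The symbols ($x_t$, $y_t$, $A$, $C$, $Q$, $R$, $K_t$, $P_t$) are assumptions chosen for illustration and are not taken from the paper; in particular, this page does not say how hyperparameters enter the system or how the unknown matrices are learned.

```latex
% Linear Gaussian Dynamical System (generic form, notation assumed)
\begin{aligned}
x_{t+1} &= A x_t + w_t, & w_t &\sim \mathcal{N}(0, Q), \\
y_t     &= C x_t + v_t, & v_t &\sim \mathcal{N}(0, R).
\end{aligned}

% Kalman filter in one-step-predictor form
\begin{aligned}
K_t &= A P_t C^\top \left( C P_t C^\top + R \right)^{-1}, \\
\hat{x}_{t+1 \mid t} &= A \hat{x}_{t \mid t-1} + K_t \left( y_t - C \hat{x}_{t \mid t-1} \right), \\
P_{t+1} &= A P_t A^\top + Q - K_t \left( C P_t C^\top + R \right) K_t^\top .
\end{aligned}
```

Here $\hat{x}_{t+1 \mid t}$ is the prediction of the next state given observations up to time $t$, and $C \hat{x}_{t+1 \mid t}$ is the corresponding prediction of the next observation. When $A$, $C$, $Q$, and $R$ are known, this predictor minimizes the mean squared one-step prediction error.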

Problem

Research questions and friction points this paper is trying to address.

Optimizes hyperparameters for faster reinforcement learning training

Models hyperparameter optimization as an unknown Linear Gaussian Dynamical System

Achieves the highest median evaluation reward in four of the five Gymnasium environments tested

Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimizes hyperparameters during training at low computational cost

Models hyperparameter optimization as an unknown Linear Gaussian Dynamical System

Uses the Kalman filter to learn a representation of the hyperparameter objective function
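
To make the Kalman filter idea concrete, here is a minimal Python sketch of the filter in one-step-predictor form on a toy scalar system. It is not the HyperController algorithm: the system matrices `A`, `C`, `Q`, and `R` are assumed known here (the paper treats the system as unknown), the observed signal is a synthetic stand-in rather than a real reward, and every function name and constant is hypothetical.

```python
import numpy as np


def kalman_predict_step(x_pred, P_pred, y, A, C, Q, R):
    """One Kalman update in one-step-predictor form.

    x_pred, P_pred: prediction of the current state and its error covariance,
    made before the current observation y was seen.
    Returns the prediction of the next state and its error covariance.
    """
    S = C @ P_pred @ C.T + R                  # innovation covariance
    K = A @ P_pred @ C.T @ np.linalg.inv(S)   # predictor gain
    innovation = y - C @ x_pred               # error of the previous prediction
    x_next = A @ x_pred + K @ innovation
    P_next = A @ P_pred @ A.T + Q - K @ S @ K.T
    return x_next, P_next


def simulate_and_predict(steps=2000, seed=0):
    """Simulate a toy scalar system and compare two one-step predictors."""
    rng = np.random.default_rng(seed)
    # Assumed toy parameters; a real objective would not expose these.
    A = np.array([[0.95]])
    C = np.array([[1.0]])
    Q = np.array([[0.5]])
    R = np.array([[0.1]])

    x = np.zeros(1)        # true hidden state
    x_pred = np.zeros(1)   # predicted state before seeing the observation
    P_pred = np.eye(1)     # covariance of that prediction
    err_kalman = 0.0
    err_naive = 0.0

    for _ in range(steps):
        y = C @ x + rng.normal(0.0, np.sqrt(R[0, 0]), size=1)
        y_hat = C @ x_pred                        # Kalman prediction of y
        err_kalman += float((y - y_hat)[0] ** 2)
        err_naive += float(y[0] ** 2)             # baseline: always predict 0
        x_pred, P_pred = kalman_predict_step(x_pred, P_pred, y, A, C, Q, R)
        x = A @ x + rng.normal(0.0, np.sqrt(Q[0, 0]), size=1)

    return err_kalman / steps, err_naive / steps, P_pred


mse_kalman, mse_naive, P_final = simulate_and_predict()
print(f"Kalman one-step MSE: {mse_kalman:.3f}")
print(f"Predict-zero MSE:    {mse_naive:.3f}")
```

On this toy system the Kalman predictor should show a clearly lower one-step error than the constant baseline, because it tracks the slowly varying hidden state. A hyperparameter controller built on this idea would additionally need to estimate the system from data and choose hyperparameters based on the predictions, which this sketch does not attempt.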
