AI Summary
Neural networks in reinforcement learning suffer from poor generalization, a lack of convergence guarantees, and limited interpretability when approximating action-value functions. Method: This paper proposes higher-order regularization (HR), a novel regularization framework that explicitly models regularization as inverse-map approximation with provable error bounds. Theoretically, HR subsumes $L_2$-regularization as a first-order special case and reveals regularization's fundamental nature as a contraction mapping, yielding necessary and sufficient conditions for optimal generalization. The method comprises a unified HR theoretical framework, rigorous upper and lower bounds on the inverse-map approximation error, HR-enhanced extreme learning machines (ELMs), and an incremental HR algorithm. Contribution/Results: Empirical evaluation on classical control benchmarks demonstrates that HR significantly improves generalization and training stability while enhancing model interpretability, establishing a new paradigm for interpretable, provably convergent reinforcement learning.
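The "inverse-map approximation" view of regularization can be made concrete. A minimal sketch, under the assumption that HR corresponds to truncating the exact Neumann-series identity $M^{-1} = \sum_{k\ge 0} (M+\lambda I)^{-1}\,[\lambda (M+\lambda I)^{-1}]^k$ for $M = A^\top A$: the $k=0$ truncation is exactly the ridge ($L_2$) solution, and keeping more terms gives a higher-order approximation of the inverse map. The function name `hr_solution` and this specific HR form are illustrative assumptions, not the paper's API.

```python
import numpy as np

def hr_solution(A, y, lam, order):
    """Order-`order` truncated Neumann-series estimate of the
    least-squares solution (order=0 reduces to ridge regression).
    Illustrative reading of HR, not the paper's exact algorithm."""
    M = A.T @ A
    R = np.linalg.inv(M + lam * np.eye(M.shape[1]))  # regularized inverse
    term = R @ (A.T @ y)          # k = 0 term: the ridge (L2) solution
    w = term.copy()
    for _ in range(order):        # add higher-order correction terms
        term = lam * (R @ term)
        w += term
    return w

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 5))
w_true = rng.normal(size=5)
y = A @ w_true

w_exact = np.linalg.lstsq(A, y, rcond=None)[0]
w_ridge = hr_solution(A, y, lam=1.0, order=0)  # plain L2 regularization
w_hr = hr_solution(A, y, lam=1.0, order=5)     # higher-order correction

# Truncation error is (lam*R)^{order+1} w_exact, and lam*R is a symmetric
# contraction here, so higher order strictly shrinks the error.
err_ridge = np.linalg.norm(w_ridge - w_exact)
err_hr = np.linalg.norm(w_hr - w_exact)
```

This also illustrates the contraction claim in the summary: the correction operator $\lambda(M+\lambda I)^{-1}$ has spectral radius below one whenever $M$ is invertible and $\lambda > 0$, so the truncated series converges to the unregularized solution.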
Abstract
The paper proposes a novel regularization procedure for machine learning. The proposed high-order regularization (HR) provides new insight into regularization, which is widely used to train neural networks that approximate the action-value function in general reinforcement learning problems. The HR method ensures the provable convergence of the approximation algorithm and thereby makes a much-needed connection between regularization and explainable learning with neural networks. Theoretically, HR shows that regularization can be regarded as an approximation of an inverse mapping with an explicitly calculable approximation error, and that $L_2$ regularization is a lower-order case of the proposed method. We provide lower and upper bounds on the error of the HR solution, which help build a reliable model, and we show that regularization under HR can be regarded as a contraction. We prove that the generalizability of a neural network is maximized with a proper regularization matrix, and that HR is applicable to neural networks with any mapping matrix. Combining a theoretical account of the extreme learning machine for neural network training with the proposed high-order regularization, one can better interpret the output of the neural network, leading to explainable learning. We present a case study based on regularized extreme learning neural networks to demonstrate the application of HR and give the corresponding incremental HR solution. We verify the proposed method by solving a classic control problem in reinforcement learning; the results demonstrate superior performance, with a significant enhancement in the generalizability of the neural network.
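The extreme learning machine (ELM) case study can be sketched in a few lines: an ELM fixes random input weights and solves only the linear output layer, which is exactly where a regularized inverse appears. The sketch below is a hedged illustration assuming the HR-enhanced solve takes the truncated-series form described in the abstract; `elm_hr_fit`, `elm_predict`, and all hyperparameters are hypothetical names and choices, not the paper's implementation.

```python
import numpy as np

def elm_hr_fit(X, y, n_hidden=50, lam=1e-1, order=3, seed=0):
    """Fit an ELM output layer with a higher-order regularized solve.
    order=0 gives the standard ridge-regularized ELM."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # fixed random input weights
    b = rng.normal(size=n_hidden)                # fixed random biases
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    M = H.T @ H
    R = np.linalg.inv(M + lam * np.eye(n_hidden))
    term = R @ (H.T @ y)                         # order-0: ridge ELM weights
    beta = term.copy()
    for _ in range(order):                       # higher-order corrections
        term = lam * (R @ term)
        beta += term
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression target standing in for an action-value function.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]

W, b, beta = elm_hr_fit(X, y)
mse = np.mean((elm_predict(X, W, b, beta) - y) ** 2)
```

Because only the output layer is trained, the whole fit is a single linear solve; an incremental HR variant would update `beta` as new rows of `H` and `y` arrive, rather than refitting from scratch.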