Periodic Regularized Q-Learning

📅 2026-02-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the lack of convergence guarantees for Q-learning with linear function approximation. The authors propose Periodic Regularized Q-learning (PRQ), which introduces a periodic regularization mechanism at the level of the projection operator, turning the projected value iteration into a contraction mapping and thereby ensuring algorithmic stability. Building on this framework, they develop a sampling-based reinforcement learning algorithm and, by leveraging stochastic approximation theory, establish a finite-time convergence guarantee for PRQ under linear function approximation. The key innovation lies in combining a regularized projection operator with contraction-mapping analysis, addressing the divergence that plagues classical Q-learning with function approximation in stochastic settings.
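
To make the stability mechanism concrete, below is a minimal sketch of how regularizing a projection can turn projected value iteration into a contraction. The notation is assumed for illustration (Φ as the feature matrix, D as a weighting matrix, 𝒯 as the Bellman operator, η as the regularization strength) and may differ from the paper's exact construction.

```latex
% Assumed notation (illustrative, not taken verbatim from the paper):
% \Phi: feature matrix, D: positive-definite weighting matrix,
% \mathcal{T}: Bellman optimality operator, \eta > 0: regularization strength.
\[
  \Pi_{\eta} = \Phi \left( \Phi^{\top} D \Phi + \eta I \right)^{-1} \Phi^{\top} D
\]
% One step of a regularized projected value iteration (RP-VI) in this notation:
\[
  \theta_{k+1} = \left( \Phi^{\top} D \Phi + \eta I \right)^{-1} \Phi^{\top} D \, \mathcal{T}(\Phi \theta_k)
\]
% Heuristically, increasing \eta shrinks the norm of \Pi_{\eta}; once
% \gamma \, \| \Pi_{\eta} \| < 1, the composed map \Pi_{\eta} \mathcal{T} is a
% contraction, so the iteration has a unique fixed point and converges to it.
```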

📝 Abstract
In reinforcement learning (RL), Q-learning is a fundamental algorithm whose convergence is guaranteed in the tabular setting. However, this convergence guarantee does not hold under linear function approximation. To overcome this limitation, a significant line of research has introduced regularization techniques to ensure stable convergence under function approximation. In this work, we propose a new algorithm, periodic regularized Q-learning (PRQ). We first introduce regularization at the level of the projection operator and explicitly construct a regularized projected value iteration (RP-VI), subsequently extending it to a sample-based RL algorithm. By appropriately regularizing the projection operator, the resulting projected value iteration becomes a contraction. By extending this regularized projection into the stochastic setting, we establish the PRQ algorithm and provide a rigorous theoretical analysis that proves finite-time convergence guarantees for PRQ under linear function approximation.
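
As a rough sketch of how the sample-based version of such a scheme might look, the snippet below implements a generic semi-gradient Q-learning step with a periodic ridge-style shrinkage. All names and the update structure (`prq_style_update`, `eta`, `period`) are hypothetical illustrations, not the paper's actual PRQ recursion.

```python
import numpy as np

def prq_style_update(theta, phi, r, phi_next_max, gamma, alpha, eta, k, period):
    """One stochastic update in the spirit of periodic regularized Q-learning.

    Hypothetical sketch; the paper's exact PRQ recursion is not reproduced here.
    theta        -- weights of the linear Q-function q(s, a) = phi(s, a) @ theta
    phi          -- feature vector of the sampled state-action pair
    r            -- sampled reward
    phi_next_max -- feature vector of the greedy action at the next state
    gamma        -- discount factor
    alpha        -- step size
    eta          -- regularization strength (assumed name)
    period       -- apply the regularization correction every `period` steps
    """
    # Standard semi-gradient TD update under linear function approximation.
    td_error = r + gamma * phi_next_max @ theta - phi @ theta
    theta = theta + alpha * td_error * phi
    # Periodic regularization: every `period` iterations, shrink the iterate,
    # mimicking the ridge term (eta * I) that makes the projection a contraction.
    if k % period == 0:
        theta = theta - alpha * eta * theta
    return theta

# Toy usage on random features, just to show the call signature.
rng = np.random.default_rng(0)
theta = np.zeros(4)
for k in range(1, 1001):
    phi, phi_next = rng.standard_normal(4), rng.standard_normal(4)
    theta = prq_style_update(theta, phi, r=rng.standard_normal(),
                             phi_next_max=phi_next, gamma=0.9,
                             alpha=0.01, eta=1.0, k=k, period=50)
```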
Problem

Research questions and friction points this paper is trying to address.

Q-learning
linear function approximation
convergence
reinforcement learning
regularization
Innovation

Methods, ideas, or system contributions that make the work stand out.

regularized projection
contraction mapping
linear function approximation
finite-time convergence
Q-learning
Hyukjun Yang
Department of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Han-Dong Lim
Department of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Donghwan Lee
KAIST
Decision making, control, and optimization