A Switching System Theory of Q-Learning with Linear Function Approximation

📅 2026-05-10
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the lack of a systematic convergence theory for Q-learning with linear function approximation under stochastic observations. It models the mean dynamics of Q-learning as a linear switched system and characterizes their stability through the joint spectral radius (JSR). Combined with tools from stochastic approximation theory, this gives a unified convergence analysis covering both i.i.d. and Markovian observation settings. The framework relates convergence of Q-learning to stability of the associated switched system; because the JSR certificate accounts for products of switching modes, it can be less conservative than conventional one-step norm-based bounds. The approach also extends to regularized variants of Q-learning with linear function approximation.
📝 Abstract
This paper develops a switching-system interpretation of Q-learning with linear function approximation (LFA) based on the joint spectral radius (JSR). We derive an exact linear switched model for the mean dynamics and relate convergence to stability of the corresponding switched system. The same construction is then used for stochastic linear Q-learning with independent and identically distributed (i.i.d.) observations and with Markovian observations. Although exact JSR computation is difficult in general, the certificate captures products of switching modes and can be less conservative than one-step norm bounds. The framework also yields a JSR-based view of regularized Q-learning with LFA. The resulting analysis connects projected Bellman equations, finite-difference stochastic-policy switching, and switched-system stability in a single parameter-space formulation.
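To illustrate why a certificate built on products of switching modes can be less conservative than a one-step norm bound, here is a minimal sketch in Python. It relies on the standard fact that, for every product length k, the largest spectral radius over all length-k products (raised to 1/k) bounds the JSR from below, and the largest spectral norm (raised to 1/k) bounds it from above. The two matrices in `modes` are hypothetical toy examples chosen for illustration, and `jsr_bounds` is an assumed helper name; neither is taken from the paper.

```python
import functools
import itertools

import numpy as np


def jsr_bounds(modes, k):
    """Return (lower, upper) bounds on the JSR from all length-k products.

    lower = max over products P of rho(P) ** (1/k)
    upper = max over products P of ||P||_2 ** (1/k)
    Brute force: enumerates len(modes) ** k products, so keep k small.
    """
    lower, upper = 0.0, 0.0
    for word in itertools.product(modes, repeat=k):
        P = functools.reduce(np.matmul, word)
        rho = max(abs(np.linalg.eigvals(P)))
        lower = max(lower, rho ** (1.0 / k))
        upper = max(upper, np.linalg.norm(P, 2) ** (1.0 / k))
    return lower, upper


# Hypothetical switching modes (illustrative only, not from the paper).
modes = [
    np.array([[0.6, 0.9], [0.0, 0.6]]),
    np.array([[0.6, 0.0], [0.0, 0.3]]),
]

for k in (1, 2, 4, 8):
    lo, up = jsr_bounds(modes, k)
    print(f"k={k}: {lo:.4f} <= JSR <= {up:.4f}")
```

For these toy modes the one-step bound (k = 1) exceeds 1 and therefore certifies nothing, while longer products push the upper bound below 1. The enumeration grows exponentially in k, which is consistent with the abstract's remark that exact JSR computation is difficult in general.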
Problem

Research questions and friction points this paper is trying to address.

Q-learning
linear function approximation
convergence analysis
switched systems
joint spectral radius
Innovation

Methods, ideas, or system contributions that make the work stand out.

switching system
joint spectral radius
Q-learning
linear function approximation
convergence analysis