🤖 AI Summary
Existing reinforcement learning methods fail to exploit the group symmetries inherent in many environments, resulting in suboptimal sample efficiency.
Method: We propose the first symmetry-aware optimistic least-squares value iteration (LSVI) framework. It incorporates symmetry priors by constructing invariant kernel functions under group actions, embedding them into a reproducing kernel Hilbert space (RKHS), and integrating optimistic rewards into value iteration to ensure efficient exploration.
Contribution/Results: We provide the first theoretical quantification of information gain and reduction in covering numbers induced by symmetry, rigorously proving that the sample complexity decreases inversely with the order of the symmetry group. Empirical evaluation on customized Frozen Lake and 2D layout tasks demonstrates substantial improvements over standard kernel-based LSVI, validating the significant boost in sample efficiency afforded by structural priors.
📝 Abstract
In many real-world reinforcement learning (RL) problems, the environment exhibits inherent symmetries that can be exploited to improve learning efficiency. This paper develops a theoretical and algorithmic framework for incorporating known group symmetries into kernel-based RL. We propose a symmetry-aware variant of optimistic least-squares value iteration (LSVI), which leverages invariant kernels to encode invariance in both rewards and transition dynamics. Our analysis establishes new bounds on the maximum information gain and covering numbers for invariant RKHSs, explicitly quantifying the sample efficiency gains from symmetry. Empirical results on a customized Frozen Lake environment and a 2D placement design problem confirm the theoretical improvements, demonstrating that symmetry-aware RL achieves significantly better performance than its standard kernel counterpart. These findings highlight the value of structural priors in designing more sample-efficient reinforcement learning algorithms.
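The invariant-kernel construction described above can be sketched as follows. This is an illustrative example, not the paper's code: it builds a group-invariant kernel by averaging a base RBF kernel over a finite symmetry group. The two-element reflection group acting on 2D states and the `gamma` bandwidth are assumptions chosen for the demonstration.

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    """Base RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

# Illustrative symmetry group G = {identity, reflection about the y-axis},
# acting on 2D state vectors s = (s_0, s_1).
group_actions = [
    lambda s: s,
    lambda s: np.array([-s[0], s[1]]),
]

def invariant_kernel(x, y, gamma=1.0):
    # Group averaging: k_G(x, y) = (1/|G|) * sum over g in G of k(g·x, y).
    # Because G is a group, the result is invariant under any g applied
    # to either argument, so the induced RKHS contains only G-invariant
    # value functions.
    return np.mean([rbf(g(x), y, gamma) for g in group_actions])
```

For example, `invariant_kernel` assigns a mirrored state the same kernel values as the original, so two states related by the symmetry share value estimates; this is the mechanism by which the symmetry prior shrinks the effective hypothesis space (and hence the covering number) by a factor tied to the group order.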