State Entropy Regularization for Robust Reinforcement Learning

📅 2025-06-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
State-entropy regularization in reinforcement learning lacks theoretical foundations, and existing robust RL methods, which typically target small, uncorrelated changes, struggle with the structured or spatially correlated perturbations that arise in transfer settings.

Method: We establish the first theoretical guarantees for state-entropy regularization under reward and transition uncertainty, characterizing its robustness bounds; combine this formal uncertainty analysis with a study of rollout sensitivity; and systematically compare state entropy with policy-entropy regularization to expose their fundamentally different dependence on perturbation structure.

Results: Experiments show that state-entropy regularization substantially improves robustness to structured perturbations; that performance degradation is governed jointly by perturbation magnitude and rollout horizon; and that the robustness gains depend critically on the number of rollouts used in policy evaluation. The analysis ties the theoretical guarantees to the empirical gains, yielding design principles for entropy-based robust RL in transfer and uncertain environments.
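For context, the two regularizers being compared are usually written as follows (standard formulations from the entropy-regularized RL literature, not equations taken from this paper); here $d_\pi$ is the discounted state-visitation distribution of policy $\pi$, $\mathcal{H}$ is Shannon entropy, and $\tau > 0$ is a regularization weight:

$$
J_{\text{state}}(\pi) \;=\; \mathbb{E}_\pi\!\Big[\textstyle\sum_{t \ge 0} \gamma^t r(s_t, a_t)\Big] \;+\; \tau\,\mathcal{H}(d_\pi), \qquad \mathcal{H}(d_\pi) = -\sum_{s} d_\pi(s)\log d_\pi(s),
$$

$$
J_{\text{policy}}(\pi) \;=\; \mathbb{E}_\pi\!\Big[\textstyle\sum_{t \ge 0} \gamma^t \big(r(s_t, a_t) + \tau\,\mathcal{H}(\pi(\cdot \mid s_t))\big)\Big].
$$

The key structural difference is that state entropy is a functional of the induced visitation distribution, so it rewards coverage of the state space, whereas policy entropy rewards per-state randomness in action selection.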

📝 Abstract
State entropy regularization has been shown empirically to improve exploration and sample complexity in reinforcement learning (RL). However, its theoretical guarantees have not been studied. In this paper, we show that state entropy regularization improves robustness to structured and spatially correlated perturbations. These types of variation are common in transfer learning but often overlooked by standard robust RL methods, which typically focus on small, uncorrelated changes. We provide a comprehensive characterization of these robustness properties, including formal guarantees under reward and transition uncertainty, as well as settings where the method performs poorly. Much of our analysis contrasts state entropy with the widely used policy entropy regularization, highlighting their different benefits. Finally, from a practical standpoint, we illustrate that, compared with policy entropy, the robustness advantages of state entropy are more sensitive to the number of rollouts used for policy evaluation.
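The abstract's final point, sensitivity to the number of evaluation rollouts, has a simple mechanical reading: the state-entropy term must itself be estimated from sampled states. Below is a minimal sketch of that dependence, assuming a discrete state space and a plug-in (histogram) entropy estimator; the toy visitation distribution `true_dist` and the function names are illustrative, not the paper's protocol:

```python
import numpy as np

def state_entropy_estimate(rollouts):
    """Plug-in (histogram) estimate of state-visitation entropy from a
    list of rollouts, each an array of discrete state ids."""
    states = np.concatenate(rollouts)
    _, counts = np.unique(states, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def regularized_evaluation(returns, rollouts, tau=0.1):
    """Monte-Carlo estimate of a state-entropy-regularized objective:
    mean return plus tau times the estimated visitation entropy."""
    return np.mean(returns) + tau * state_entropy_estimate(rollouts)

# The plug-in estimator is biased low for small samples, so with few
# rollouts the entropy bonus (and hence the regularized evaluation)
# is systematically underestimated and noisy.
rng = np.random.default_rng(0)
true_dist = np.array([0.7, 0.1, 0.1, 0.05, 0.05])  # hypothetical visitation probs
for n_rollouts in (2, 8, 32, 128):
    rollouts = [rng.choice(5, size=20, p=true_dist) for _ in range(n_rollouts)]
    print(n_rollouts, round(float(state_entropy_estimate(rollouts)), 3))
```

Running this shows the entropy estimate creeping up toward its true value as the rollout budget grows, which is one concrete way a regularized policy evaluation can be rollout-sensitive.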
Problem

Research questions and friction points this paper is trying to address.

Theoretical guarantees of state entropy regularization in RL.
Robustness to structured and spatially correlated perturbations (illustrated in the sketch after this list).
Comparison of state entropy and policy entropy regularization benefits.
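To make the second item concrete, here is a sketch of what "spatially correlated" can mean, assuming a one-dimensional state index and a Gaussian-process draw as the perturbation model; both are illustrative assumptions, not the paper's perturbation classes:

```python
import numpy as np

rng = np.random.default_rng(1)
n_states = 50
xs = np.arange(n_states)

# Unstructured perturbation: i.i.d. noise added to each state's reward,
# the regime most standard robust-RL formulations target.
iid_noise = 0.1 * rng.standard_normal(n_states)

# Structured perturbation: spatially correlated noise drawn from a GP
# with a squared-exponential kernel over the state index (lengthscale 5),
# so nearby states are perturbed together.
K = np.exp(-0.5 * ((xs[:, None] - xs[None, :]) / 5.0) ** 2)
K += 1e-6 * np.eye(n_states)  # jitter for numerical stability
correlated_noise = 0.1 * rng.multivariate_normal(np.zeros(n_states), K)

# A policy concentrated on a few neighboring states can lose reward on
# all of them simultaneously under the correlated draw; broad state
# coverage (high state entropy) hedges against that failure mode.
```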
Innovation

Methods, ideas, or system contributions that make the work stand out.

State entropy regularization improves robustness beyond its known exploration benefits.
Handles the structured, spatially correlated perturbations that standard robust RL methods overlook.
Contrasts the benefits of state entropy with those of policy entropy regularization (see the toy sketch after this list).
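A toy contrast for the last item, showing why the two regularizers can pull apart; this is a constructed example, not one from the paper. In an MDP whose dynamics funnel the agent to one state regardless of its actions, a uniform policy has maximal policy entropy yet a highly concentrated state distribution:

```python
import numpy as np

# Hypothetical 3-state MDP in which every action leads back to state 0
# with probability 0.9; P[s, s'] is the state transition matrix under a
# uniform-random policy over 2 actions (actions marginalized out).
P = np.array([
    [0.9, 0.05, 0.05],
    [0.9, 0.05, 0.05],
    [0.9, 0.05, 0.05],
])

# Stationary state distribution: iterate d <- d @ P to a fixed point.
d = np.ones(3) / 3
for _ in range(1000):
    d = d @ P

policy_entropy = np.log(2)              # maximal: uniform over 2 actions
state_entropy = -np.sum(d * np.log(d))  # ~0.39 nats, far below log(3) ~ 1.10
print(f"policy entropy = {policy_entropy:.3f} nats, "
      f"state entropy = {state_entropy:.3f} nats")
# Policy-entropy regularization sees nothing left to improve here, while
# state-entropy regularization penalizes the concentrated visitation.
```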