🤖 AI Summary
This work addresses the redundancy of high-dimensional neural representations in deep reinforcement learning, arguing that task-relevant value and policy structure is inherently low-dimensional. The authors propose a lightweight, architecture-agnostic bottleneck: a fixed orthonormal projection that constrains encoder features to a low-dimensional subspace without auxiliary objectives, pretraining, or modifications to the underlying reinforcement learning algorithm. Under a linear realizability assumption, the theoretical analysis shows that the bottleneck preserves expressivity and leaves the induced gradient dynamics unchanged, up to an equivalent low-dimensional parameterization, whenever the bottleneck dimension exceeds the intrinsic rank of the optimal value function in feature space. Experiments on single-task and multi-task benchmarks show that performance matches or improves on the baseline once the bottleneck dimension exceeds a small task-dependent threshold, and that value representations can in many cases be compressed to extremely low dimensions without loss. An analysis of representation geometry finds that orthogonal bottlenecks stabilize feature norms and are associated with higher effective rank, which supports a representation-space interpretation of the manifold hypothesis in reinforcement learning.
📝 Abstract
Deep reinforcement learning (RL) agents commonly rely on high-dimensional neural representations, despite growing evidence that task-relevant value and policy structure may be intrinsically low-dimensional. In this work, we present a simple yet effective representation-level prior that inserts a fixed orthonormal projection to constrain encoder features to a low-dimensional subspace, requiring no auxiliary objectives, pretraining, or changes to the underlying RL algorithm. Under a linear realizability assumption, we prove that when the bottleneck dimension exceeds the intrinsic rank of the optimal value function in feature space, the bottleneck preserves expressivity and leaves the induced gradient dynamics unchanged up to an equivalent low-dimensional parameterization. Empirically, we find that across both single-task and multi-task benchmarks, baseline performance is either matched or improved once the bottleneck dimension exceeds a small task-dependent threshold; in many cases, value representations can be compressed to extremely low dimensions without loss, and the minimal sufficient dimension depends far more on environment complexity than on encoder width. In addition, we analyze representation geometry and find that orthogonal bottlenecks stabilize feature norms and are associated with higher effective rank. Together, these results support a representation-space interpretation of the manifold hypothesis in reinforcement learning and position orthogonal bottlenecks as a lightweight, architecture-agnostic mechanism for shaping RL representations.
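The bottleneck described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the fixed orthonormal projection is drawn once from a QR decomposition of a Gaussian matrix and then frozen, and it uses one common entropy-based definition of effective rank, which may differ from the metric used in the paper. The names `make_orthonormal_projection` and `effective_rank`, the feature width of 64, and the bottleneck dimension of 8 are all hypothetical choices made for illustration.

```python
import numpy as np


def make_orthonormal_projection(feature_dim, bottleneck_dim, seed=0):
    """Return a fixed (bottleneck_dim, feature_dim) matrix with orthonormal rows.

    Assumption: the projection is sampled once via QR and never trained.
    """
    rng = np.random.default_rng(seed)
    gaussian = rng.standard_normal((feature_dim, bottleneck_dim))
    # Reduced QR gives q with shape (feature_dim, bottleneck_dim)
    # and orthonormal columns, so q.T has orthonormal rows.
    q, _ = np.linalg.qr(gaussian)
    return q.T


def effective_rank(features, eps=1e-12):
    """Entropy-based effective rank of a (batch, dim) feature matrix.

    Assumption: exp of the Shannon entropy of the normalized singular values.
    """
    singular_values = np.linalg.svd(features, compute_uv=False)
    p = singular_values / (singular_values.sum() + eps)
    entropy = -np.sum(p * np.log(p + eps))
    return float(np.exp(entropy))


# Stand-in for a batch of encoder outputs (random data, not real RL features).
rng = np.random.default_rng(1)
encoder_features = rng.standard_normal((256, 64))

# Project the 64-dimensional features onto a fixed 8-dimensional subspace.
projection = make_orthonormal_projection(feature_dim=64, bottleneck_dim=8)
bottleneck_features = encoder_features @ projection.T  # shape (256, 8)

print("bottleneck shape:", bottleneck_features.shape)
print("effective rank:", effective_rank(bottleneck_features))
```

In an actual agent, the value or policy head would read `bottleneck_features` in place of the full encoder output, while the encoder itself keeps training through the frozen projection. Because the rows of `projection` are orthonormal, the map never increases the norm of a feature vector.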