Impact of Connectivity on Laplacian Representations in Reinforcement Learning

📅 2026-03-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the curse of dimensionality in large-scale reinforcement learning by proposing a compact linear state representation derived from the spectral properties of the state graph Laplacian, estimated directly from trajectory samples. The key contribution lies in establishing, for the first time, a theoretical connection between linear value function approximation error and the algebraic connectivity of the state graph. The authors introduce an end-to-end error decomposition framework that clarifies several misconceptions regarding the use of Laplacian operators in reinforcement learning. Theoretical analysis yields an upper bound on the approximation error that explicitly depends on the graph’s connectivity, and empirical validation in grid-world environments demonstrates the effectiveness of the proposed approach.

📝 Abstract
Learning compact state representations in Markov Decision Processes (MDPs) has proven crucial for addressing the curse of dimensionality in large-scale reinforcement learning (RL) problems. Existing principled approaches leverage structural priors on the MDP by constructing state representations as linear combinations of the state-graph Laplacian eigenvectors. When the transition graph is unknown or the state space is prohibitively large, the graph spectral features can be estimated directly from sample trajectories. In this work, we prove an upper bound on the approximation error of linear value function approximation under the learned spectral features. We show how this error scales with the algebraic connectivity of the state graph, grounding the approximation quality in the topological structure of the MDP. We further bound the error introduced by the eigenvector estimation itself, leading to an end-to-end error decomposition across the representation learning pipeline. Additionally, although our expression of the Laplacian operator for the RL setting is equivalent to existing ones, it prevents some common misunderstandings, examples of which we highlight from the literature. Our results hold for general (non-uniform) policies without any assumptions on the symmetry of the induced transition kernel. We validate our theoretical findings with numerical simulations on gridworld environments.
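The pipeline the abstract describes can be illustrated with a minimal sketch (not the paper's actual construction): build the transition graph of a small gridworld, compute the combinatorial Laplacian, read off the algebraic connectivity (the second-smallest eigenvalue, i.e. the Fiedler value), and use the smallest-eigenvalue eigenvectors as linear features for value function approximation. The 5x5 grid, the choice of k, and the toy target values are illustrative assumptions, not details from the paper.

```python
import numpy as np

n = 5                                   # grid side length (assumed)
S = n * n                               # number of states

# Adjacency of the 4-neighbour grid graph; state s = row * n + col.
W = np.zeros((S, S))
for r in range(n):
    for c in range(n):
        s = r * n + c
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n and 0 <= cc < n:
                W[s, rr * n + cc] = 1.0

# Combinatorial graph Laplacian L = D - W and its spectrum.
L = np.diag(W.sum(axis=1)) - W
evals, evecs = np.linalg.eigh(L)

# Algebraic connectivity = second-smallest Laplacian eigenvalue.
algebraic_connectivity = evals[1]

# Compact representation: the k smallest-eigenvalue eigenvectors.
k = 5                                   # feature dimension (assumed)
Phi = evecs[:, :k]                      # S x k feature matrix

# Linear value-function approximation: project a toy target V onto
# span(Phi) by least squares and measure the approximation residual.
V = np.arange(S, dtype=float)           # illustrative target values
w, *_ = np.linalg.lstsq(Phi, V, rcond=None)
residual = np.linalg.norm(Phi @ w - V)

print(f"algebraic connectivity: {algebraic_connectivity:.4f}")
print(f"approximation residual: {residual:.4f}")
```

In this sketch the graph is known and symmetric, so the eigenvectors are exact; the paper's setting additionally covers features estimated from trajectories under general non-uniform policies, where the estimation error enters the end-to-end bound.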
Problem

Research questions and friction points this paper is trying to address.

Laplacian representations
reinforcement learning
approximation error
algebraic connectivity
Markov Decision Processes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Laplacian representation
algebraic connectivity
value function approximation
representation learning
Markov Decision Processes