Causality Meets Locality: Provably Generalizable and Scalable Policy Learning for Networked Systems

📅 2025-10-24
🤖 AI Summary
To address scalability and cross-domain generalization in policy learning for large-scale networked systems (e.g., traffic, power, and wireless networks), this paper proposes GSAC (Generalizable and Scalable Actor-Critic). GSAC couples sparse local causal discovery with meta actor-critic learning: each agent learns a causal mask that selects compact state and domain-factor representations within a κ-hop truncated neighborhood. Theoretically, the mask provably identifies the minimal set of neighborhood variables that influence each agent's dynamics; the resulting approximately compact representations (ACRs) are exponentially tight and bound the error of truncating value functions to κ-hop neighborhoods; and finite-sample guarantees cover causal recovery, actor-critic convergence, and the adaptation gap. At test time, GSAC estimates the new domain factor from only a few trajectories and deploys the adapted policy. The authors report that it adapts rapidly and significantly outperforms learning-from-scratch and conventional adaptation baselines.

📝 Abstract
Large-scale networked systems, such as traffic, power, and wireless grids, challenge reinforcement-learning agents with both scale and environment shifts. To address these challenges, we propose GSAC (Generalizable and Scalable Actor-Critic), a framework that couples causal representation learning with meta actor-critic learning to achieve both scalability and domain generalization. Each agent first learns a sparse local causal mask that provably identifies the minimal neighborhood variables influencing its dynamics, yielding exponentially tight approximately compact representations (ACRs) of state and domain factors. These ACRs bound the error of truncating value functions to $κ$-hop neighborhoods, enabling efficient learning on graphs. A meta actor-critic then trains a shared policy across multiple source domains while conditioning on the compact domain factors; at test time, a few trajectories suffice to estimate the new domain factor and deploy the adapted policy. We establish finite-sample guarantees on causal recovery, actor-critic convergence, and adaptation gap, and show that GSAC adapts rapidly and significantly outperforms learning-from-scratch and conventional adaptation baselines.
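
To make the κ-hop truncation and causal masking described above concrete, the following is a minimal sketch and not the paper's implementation. It assumes the network is given as an adjacency list, that each node carries a small list of state variables, and that a binary mask per node has already been learned; the helper names `k_hop_neighborhood` and `masked_local_state` and the example mask values are hypothetical.

```python
from collections import deque


def k_hop_neighborhood(adjacency, agent, kappa):
    """Return the set of nodes within `kappa` hops of `agent` (agent included)."""
    hops = {agent: 0}
    queue = deque([agent])
    while queue:
        node = queue.popleft()
        if hops[node] == kappa:
            continue  # do not expand beyond the truncation radius
        for nbr in adjacency[node]:
            if nbr not in hops:
                hops[nbr] = hops[node] + 1
                queue.append(nbr)
    return set(hops)


def masked_local_state(global_state, neighborhood, causal_mask):
    """Keep only the neighborhood variables that the binary mask marks as relevant."""
    compact = {}
    for node in sorted(neighborhood):
        kept = [x for x, keep in zip(global_state[node], causal_mask[node]) if keep]
        if kept:
            compact[node] = kept
    return compact


# Toy line graph 0-1-2-3-4 with two state variables per node.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
global_state = {i: [float(i), 10.0 * i] for i in adjacency}
# Hypothetical mask for agent 2: 1 means the variable influences agent 2's dynamics.
causal_mask = {0: [0, 0], 1: [1, 0], 2: [1, 1], 3: [0, 1], 4: [1, 1]}

hood = k_hop_neighborhood(adjacency, agent=2, kappa=1)
compact = masked_local_state(global_state, hood, causal_mask)
print(hood, compact)
```

In this toy setup, agent 2 with κ = 1 sees only nodes 1, 2, and 3, and the mask prunes that neighborhood further to four of its six variables. A truncated value function would take `compact` as input instead of the full global state.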
Problem

Research questions and friction points this paper is trying to address.

Addressing scale and environment shifts in networked systems
Learning sparse local causal masks for neighborhood variables
Enabling efficient policy adaptation across multiple domains
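
The adaptation step (estimate a domain factor from a few trajectories, then deploy a policy conditioned on it) can be illustrated with a deliberately simple toy, sketched here under strong assumptions that are not taken from the paper: scalar linear dynamics `s_next = a*s + b*u + theta + noise` with known `a` and `b`, an additive domain factor `theta`, and a hand-written controller. All function names and parameter values are hypothetical.

```python
import random


def simulate(theta, a, b, policy, steps, rng, noise_std=0.01, s0=1.0):
    """Roll out one trajectory of (s, u, s_next) transitions from the toy dynamics."""
    s, transitions = s0, []
    for _ in range(steps):
        u = policy(s)
        s_next = a * s + b * u + theta + rng.gauss(0.0, noise_std)
        transitions.append((s, u, s_next))
        s = s_next
    return transitions


def estimate_domain_factor(trajectories, a, b):
    """Least-squares estimate of the additive offset: the mean one-step residual."""
    residuals = [s_next - a * s - b * u
                 for traj in trajectories for (s, u, s_next) in traj]
    return sum(residuals) / len(residuals)


def conditioned_policy(theta_hat, a, b):
    """Policy conditioned on the estimated factor; it cancels the predicted next state."""
    return lambda s: -(a * s + theta_hat) / b


rng = random.Random(0)
a, b, true_theta = 0.9, 0.5, 0.7  # assumed toy parameters

# Collect a few test trajectories in the unseen domain with a zero-action policy.
few_trajectories = [simulate(true_theta, a, b, lambda s: 0.0, 20, rng) for _ in range(3)]
theta_hat = estimate_domain_factor(few_trajectories, a, b)

# Deploy the policy conditioned on the estimated domain factor.
adapted = simulate(true_theta, a, b, conditioned_policy(theta_hat, a, b), 20, rng)
final_state = adapted[-1][2]
print(theta_hat, final_state)
```

The design point this sketch is meant to convey is that no policy parameters are retrained at test time: only the low-dimensional factor is estimated, and the shared policy takes it as an input.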
Innovation

Methods, ideas, or system contributions that make the work stand out.

Causal representation learning with meta actor-critic
Sparse local causal masks for neighborhood variables
Compact representations enabling efficient graph learning