🤖 AI Summary
This work investigates how model pruning affects policy robustness in state-adversarial Markov decision processes (SA-MDPs). Addressing the question of whether pruning degrades certified robustness, the authors establish the first theoretical framework for pruning in adversarial reinforcement learning, proving that element-wise pruning can only tighten certified robustness bounds. Methodologically, magnitude-based and micro-pruning are analyzed within a three-term regret decomposition that, for Lipschitz-constrained Gaussian and categorical policies, jointly characterizes clean-task performance, pruning-induced loss, and robustness gain. Experiments show that moderate sparsity substantially improves certified robustness while preserving, and sometimes even enhancing, clean-task performance. The results trace the performance-robustness trade-off frontier and position pruning as more than a compression technique: it serves as a structured mechanism for robustness optimization.
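To make the three-term structure concrete, here is a minimal sketch of the shape such a decomposition can take; the notation ($J$ for clean return, $J_{\mathrm{adv}}$ for return under the worst-case state adversary, $\pi^\star$ for the optimal clean policy, $\tilde{\pi}$ for the pruned policy) is illustrative and not taken from the paper:

$$
\underbrace{J(\pi^\star) - J_{\mathrm{adv}}(\tilde{\pi})}_{\text{adversarial regret}}
\;=\;
\underbrace{J(\pi^\star) - J(\pi)}_{\text{clean-task regret}}
\;+\;
\underbrace{J(\pi) - J(\tilde{\pi})}_{\text{pruning-induced loss}}
\;+\;
\underbrace{J(\tilde{\pi}) - J_{\mathrm{adv}}(\tilde{\pi})}_{\text{adversarial gap}}.
$$

The identity itself is telescoping and holds by construction; the substance of such a framework lies in bounding the last term by the certified robustness bound, which, per the result above, element-wise pruning can only tighten.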
📝 Abstract
Reinforcement learning (RL) policies deployed in real-world environments must remain reliable under adversarial perturbations. At the same time, modern deep RL agents are heavily over-parameterized, raising costs and fragility concerns. While pruning has been shown to improve robustness in supervised learning, its role in adversarial RL remains poorly understood. We develop the first theoretical framework for certified robustness under pruning in state-adversarial Markov decision processes (SA-MDPs). For Gaussian and categorical policies with Lipschitz networks, we prove that element-wise pruning can only tighten certified robustness bounds; pruning never makes the policy less robust. Building on this, we derive a novel three-term regret decomposition that disentangles clean-task performance, pruning-induced performance loss, and robustness gains, exposing a fundamental performance-robustness frontier. Empirically, we evaluate magnitude and micro-pruning schedules on continuous-control benchmarks with strong policy-aware adversaries. Across tasks, pruning consistently uncovers reproducible "sweet spots" at moderate sparsity levels, where robustness improves substantially without harming, and sometimes even enhancing, clean performance. These results position pruning not merely as a compression tool but as a structural intervention for robust RL.
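As a rough illustration of why element-wise pruning cannot loosen a norm-based certificate, the sketch below is a hypothetical implementation, not the paper's: the monotone infinity-norm Lipschitz bound and the Gaussian total-variation certificate are assumptions chosen for the example. It magnitude-prunes a toy policy mean network and recomputes a certified bound on the action-distribution shift under a bounded state perturbation:

```python
# Illustrative sketch only (not the paper's method): a monotone
# infinity-norm Lipschitz bound plus a Gaussian mean-shift TV certificate.
import numpy as np
from scipy.stats import norm

def lipschitz_upper_bound(weights):
    """Upper-bound the Lipschitz constant (sup-norm to sup-norm) of a
    network with 1-Lipschitz activations by the product of per-layer
    induced infinity-norms (max absolute row sums). This bound is
    monotone in entrywise weight magnitudes, so zeroing any entry can
    only shrink it."""
    bound = 1.0
    for W in weights:
        bound *= np.abs(W).sum(axis=1).max()
    return bound

def magnitude_prune(weights, sparsity):
    """Element-wise magnitude pruning: zero the globally smallest
    `sparsity` fraction of weights across all layers."""
    flat = np.concatenate([np.abs(W).ravel() for W in weights])
    thresh = np.quantile(flat, sparsity)
    return [np.where(np.abs(W) >= thresh, W, 0.0) for W in weights]

def certified_tv_bound(lip, eps, sigma):
    """For a Gaussian policy N(mu(s), sigma^2) with a lip-Lipschitz mean,
    a state perturbation of size at most eps shifts the mean by at most
    lip * eps, and the total-variation distance between the two action
    distributions is 2 * Phi(lip * eps / (2 * sigma)) - 1."""
    return 2.0 * norm.cdf(lip * eps / (2.0 * sigma)) - 1.0

rng = np.random.default_rng(0)
weights = [rng.normal(size=(64, 8)), rng.normal(size=(1, 64))]  # toy mean network

for sparsity in (0.0, 0.5, 0.9):
    pruned = magnitude_prune(weights, sparsity)
    lip = lipschitz_upper_bound(pruned)
    tv = certified_tv_bound(lip, eps=0.01, sigma=1.0)
    print(f"sparsity={sparsity:.1f}  Lipschitz bound={lip:10.2f}  TV bound={tv:.4f}")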