Pruning Cannot Hurt Robustness: Certified Trade-offs in Reinforcement Learning

📅 2025-10-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the impact of model pruning on policy robustness in state-adversarial Markov decision processes (SA-MDPs). Addressing the central question, "Does pruning degrade certified robustness?", we establish the first theoretical framework for pruning in adversarial reinforcement learning, proving that element-wise pruning can only tighten certified robustness bounds. Methodologically, we study magnitude-based and micro-pruning through a three-term regret decomposition, using Lipschitz-constrained Gaussian and categorical policies to jointly characterize clean-task performance, pruning-induced loss, and robustness gain. Experiments demonstrate that moderate sparsity substantially improves certified robustness while preserving, and sometimes even enhancing, clean-task performance. Our results chart the performance-robustness trade-off frontier and establish pruning as more than a compression technique: it serves as a structured mechanism for robustness optimization.
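The tightening claim can be made concrete. Below is a minimal sketch (not the authors' implementation) under the assumption that the certificate bounds the policy network's Lipschitz constant by the product of per-layer ℓ∞-induced operator norms (maximum absolute row sums); since zeroing a weight can never increase a row's absolute sum, element-wise pruning can only shrink this particular bound.

```python
import numpy as np

def linf_operator_norm(W):
    # ||W||_inf = maximum absolute row sum; monotone under zeroing entries.
    return np.abs(W).sum(axis=1).max()

def magnitude_prune(W, sparsity):
    # Element-wise magnitude pruning: zero out the smallest-magnitude weights.
    k = int(sparsity * W.size)
    if k == 0:
        return W.copy()
    threshold = np.sort(np.abs(W), axis=None)[k - 1]
    return np.where(np.abs(W) <= threshold, 0.0, W)

def lipschitz_certificate(weights):
    # Crude Lipschitz upper bound for a network with 1-Lipschitz activations:
    # the product of per-layer operator norms.
    return np.prod([linf_operator_norm(W) for W in weights])

rng = np.random.default_rng(0)
layers = [rng.normal(size=(64, 64)) for _ in range(3)]
dense = lipschitz_certificate(layers)
pruned = lipschitz_certificate([magnitude_prune(W, 0.5) for W in layers])
assert pruned <= dense  # this style of certificate can only tighten
print(f"dense bound: {dense:.2f}, pruned bound: {pruned:.2f}")
```

One caveat worth noting: this monotonicity holds for entrywise-monotone norms such as the ℓ∞- or ℓ1-induced norms; a spectral-norm certificate is not monotone under zeroing in general, so the choice of certificate matters.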

📝 Abstract
Reinforcement learning (RL) policies deployed in real-world environments must remain reliable under adversarial perturbations. At the same time, modern deep RL agents are heavily over-parameterized, raising costs and fragility concerns. While pruning has been shown to improve robustness in supervised learning, its role in adversarial RL remains poorly understood. We develop the first theoretical framework for certified robustness under pruning in state-adversarial Markov decision processes (SA-MDPs). For Gaussian and categorical policies with Lipschitz networks, we prove that element-wise pruning can only tighten certified robustness bounds; pruning never makes the policy less robust. Building on this, we derive a novel three-term regret decomposition that disentangles clean-task performance, pruning-induced performance loss, and robustness gains, exposing a fundamental performance-robustness frontier. Empirically, we evaluate magnitude and micro-pruning schedules on continuous-control benchmarks with strong policy-aware adversaries. Across tasks, pruning consistently uncovers reproducible "sweet spots" at moderate sparsity levels, where robustness improves substantially without harming, and sometimes even enhancing, clean performance. These results position pruning not merely as a compression tool but as a structural intervention for robust RL.
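For context on the pruning schedules mentioned above: a common shape for a gradual magnitude-pruning schedule, used here purely as an illustration (the paper's exact schedules are not specified in this summary), is the cubic sparsity ramp of Zhu & Gupta (2017):

```python
def sparsity_at(step, total_steps, final_sparsity, initial_sparsity=0.0):
    # Cubic sparsity ramp: sparsity rises quickly early in training and
    # flattens as it approaches the final target.
    t = min(max(step / total_steps, 0.0), 1.0)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - t) ** 3
```

At each pruning step, weights below the magnitude threshold implied by the current sparsity target are zeroed, as in the sketch above.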
Problem

Research questions and friction points this paper is trying to address.

Establishes a theoretical framework for certified robustness of pruned policies in adversarial RL
Proves that element-wise pruning tightens robustness bounds without compromising policy reliability
Identifies performance-robustness trade-offs through a novel regret decomposition analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Element-wise pruning provably tightens certified robustness bounds
Three-term regret decomposition reveals the performance-robustness frontier (sketched after this list)
Pruning uncovers robustness sweet spots at moderate sparsity without loss of clean performance
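One plausible shape for such a decomposition, written here as an illustrative telescoping identity rather than the paper's exact statement: with $J$ the clean return, $\tilde{J}$ the return under worst-case state perturbations, $\pi$ the dense policy, $\pi_s$ its pruned counterpart at sparsity $s$, and $\pi^\star$ an optimal clean policy,

$$
\underbrace{J(\pi^\star) - \tilde{J}(\pi_s)}_{\text{adversarial regret}}
= \underbrace{J(\pi^\star) - J(\pi)}_{\text{clean-task regret}}
+ \underbrace{J(\pi) - J(\pi_s)}_{\text{pruning-induced loss}}
+ \underbrace{J(\pi_s) - \tilde{J}(\pi_s)}_{\text{robustness gap}}
$$

Sweeping $s$ traces out the performance-robustness frontier: the middle term tends to grow with sparsity while the last term shrinks as pruning tightens the certificate.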
James Pedley
Machine Learning Research Group, Department of Engineering Science, University of Oxford
Benjamin Etheridge
Machine Learning Research Group, Department of Engineering Science, University of Oxford
Stephen J. Roberts
Machine Learning Research Group, Department of Engineering Science, University of Oxford
Francesco Quinzan
University of Oxford