🤖 AI Summary
To address the insufficient robustness of deep reinforcement learning (DRL) against noise and adversarial perturbations, this paper proposes CAMP—a novel paradigm for provably robust policy training that jointly optimizes certified radius and expected return for the first time. Methodologically, CAMP introduces a surrogate loss function for global certified radius, derived from training statistics, and integrates Gaussian perturbation augmentation with policy imitation distillation to stabilize training. Theoretically, it establishes an analytical framework for certified radius under policy smoothing. Empirically, CAMP significantly improves the robustness–return trade-off across multiple tasks: its certified expected return reaches up to twice that of baseline methods, and—crucially—it achieves the first simultaneous improvement in both certified robustness and certified return. Thus, CAMP provides a practically effective and theoretically rigorous solution to robust DRL.
📝 Abstract
Deep reinforcement learning (DRL) has gained widespread adoption in control and decision-making tasks due to its strong performance in dynamic environments. However, DRL agents are vulnerable to noisy observations and adversarial attacks, and concerns about the adversarial robustness of DRL systems have emerged. Recent efforts have focused on addressing these robustness issues by establishing rigorous theoretical guarantees for the returns achieved by DRL agents in adversarial settings. Among these approaches, policy smoothing has proven to be an effective and scalable method for certifying the robustness of DRL agents. Nevertheless, existing certifiably robust DRL methods rely on policies trained with simple Gaussian augmentations, resulting in a suboptimal trade-off between certified robustness and certified return. To address this issue, we introduce a novel paradigm dubbed **C**ertified-r**A**dius-**M**aximizing **P**olicy (`CAMP`) training. `CAMP` is designed to enhance DRL policies, achieving better utility without compromising provable robustness. By leveraging the insight that the global certified radius can be derived from local certified radii based on training-time statistics, `CAMP` formulates a surrogate loss related to the local certified radius and optimizes the policy guided by this surrogate loss. We also introduce *policy imitation* as a novel technique to stabilize `CAMP` training. Experimental results demonstrate that `CAMP` significantly improves the robustness-return trade-off across various tasks. Based on the results, `CAMP` can achieve up to twice the certified expected return compared to baselines. Our code is available at https://github.com/NeuralSec/camp-robust-rl.
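To make the policy-smoothing idea in the abstract concrete, here is a minimal, hypothetical sketch (not the paper's implementation): a smoothed policy selects the action winning a majority vote of greedy actions over Gaussian-perturbed copies of the observation. The `q_function`, `sigma`, and `n_samples` names are illustrative assumptions.

```python
import numpy as np

def smoothed_action(q_function, obs, sigma=0.1, n_samples=200, seed=0):
    """Policy smoothing sketch: vote over greedy actions taken on
    Gaussian-perturbed observations (assumed toy setup, not CAMP itself)."""
    rng = np.random.default_rng(seed)
    # Sample n_samples noisy copies of the observation.
    noisy = obs + sigma * rng.standard_normal((n_samples, obs.shape[0]))
    # Greedy action under the Q-function for each perturbed observation.
    votes = np.argmax(np.array([q_function(o) for o in noisy]), axis=1)
    # Majority vote is the smoothed policy's action.
    return int(np.bincount(votes).argmax())

# Toy Q-function with two actions: prefers action 1 when obs sums positive.
q = lambda o: np.array([-o.sum(), o.sum()])
action = smoothed_action(q, np.array([0.5, 0.5]), sigma=0.1)
```

The certification argument (not shown) bounds how much an adversarial perturbation of the observation can shift the vote distribution, yielding a certified radius; CAMP's contribution is training the policy so that this radius and the return are jointly maximized.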