🤖 AI Summary
Tight convex relaxations (e.g., CROWN, DeepPoly) induce discontinuous, non-smooth, and perturbation-sensitive loss landscapes during certified robust training, often leading to worse results than looser relaxations. To address this, we propose the Gaussian Loss Smoothing (GLS) framework and show theoretically that it alleviates these pathologies under tight relaxations. We instantiate GLS with two optimization algorithms: zeroth-order Policy Gradients with Parameter-based Exploration (PGPE), which enables training with non-differentiable tight relaxations, and first-order Randomized Gradient Smoothing (RGS), which requires gradients of the relaxation but is far more efficient. Empirically, GLS combined with tight relaxations surpasses state-of-the-art methods on CIFAR-10 and Tiny-ImageNet, delivering higher certified accuracy on identical network architectures in many settings. These results demonstrate the effectiveness and generality of GLS across datasets and relaxation schemes.
📝 Abstract
Training neural networks with high certified accuracy against adversarial examples remains an open challenge despite significant efforts. While certification methods can effectively leverage tight convex relaxations for bound computation, in training these methods can, perhaps surprisingly, perform worse than looser relaxations. Prior work hypothesized that this phenomenon is caused by the discontinuity, non-smoothness, and perturbation sensitivity of the loss surface induced by tighter relaxations. In this work, we theoretically show that Gaussian Loss Smoothing (GLS) can alleviate these issues. We confirm this empirically by instantiating GLS with two variants: a zeroth-order optimization algorithm, called PGPE, which allows training with non-differentiable relaxations, and a first-order optimization algorithm, called RGS, which requires gradients of the relaxation but is much more efficient than PGPE. Extensive experiments show that, when combined with tight relaxations, these methods surpass state-of-the-art methods trained on the same network architecture in many settings. Our results clearly demonstrate the promise of Gaussian Loss Smoothing for training certifiably robust neural networks and pave the way towards leveraging tighter relaxations for certified training.
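To make the core idea concrete, here is a minimal, hypothetical sketch of the zeroth-order principle behind PGPE-style Gaussian loss smoothing: instead of differentiating a possibly non-differentiable loss L(θ) directly, one optimizes the smoothed loss L_σ(θ) = E_{ε∼N(0,σ²I)}[L(θ+ε)], whose gradient can be estimated purely from loss evaluations at sampled parameter perturbations. The function names, the antithetic-sampling variant, and the toy non-smooth loss below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def smoothed_grad(loss, theta, sigma=0.1, n_samples=64, rng=None):
    """Zeroth-order estimate of the gradient of the Gaussian-smoothed loss
    L_sigma(theta) = E_{eps ~ N(0, sigma^2 I)}[loss(theta + eps)].

    Uses antithetic pairs (theta + eps, theta - eps) for variance reduction.
    Note: this is an illustrative sketch, not the paper's PGPE implementation.
    """
    rng = rng or np.random.default_rng(0)
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        eps = rng.normal(0.0, sigma, size=theta.shape)
        # Per-sample estimator: (L(theta+eps) - L(theta-eps)) * eps / (2 sigma^2)
        grad += (loss(theta + eps) - loss(theta - eps)) * eps / (2.0 * sigma**2)
    return grad / n_samples

# Toy non-differentiable loss (kink at 0), standing in for a tight-relaxation
# bound that may be non-smooth in the network parameters.
loss = lambda th: np.abs(th).sum()

theta = np.array([1.0, -2.0])
g = smoothed_grad(loss, theta, sigma=0.1, n_samples=2000)
# Away from the kink, the smoothed gradient approximates sign(theta),
# even though loss itself is not differentiable everywhere.
```

Only loss *values* are needed, which is why this style of estimator can train through relaxations whose bounds are not differentiable; the first-order RGS variant instead averages actual gradients at perturbed parameters, trading generality for efficiency.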