SGD Jittering: A Training Strategy for Robust and Accurate Model-Based Architectures

📅 2024-10-18
📈 Citations: 0
Influential: 0

🤖 AI Summary
To improve generalization and robustness together in model-based architectures (MBAs) for inverse problems, this paper proposes SGD jittering, a training strategy that injects noise at each iteration of the unrolled reconstruction rather than only at the input. The authors show theoretically that SGD jittering generalizes better than standard mean squared error (MSE) training and is more robust to average-case attacks. The strategy targets loop-unrolling MBAs and is also extended to an SPGD variant. It is validated on denoising toy examples, seismic deconvolution, and single-coil MRI reconstruction. Experiments show cleaner reconstructions on out-of-distribution data and enhanced robustness against adversarial attacks.

📝 Abstract
Inverse problems aim to reconstruct unseen data from corrupted or perturbed measurements. While most work focuses on improving reconstruction quality, generalization accuracy and robustness are equally important, especially for safety-critical applications. Model-based architectures (MBAs), such as loop unrolling methods, are considered more interpretable and achieve better reconstructions. Empirical evidence suggests that MBAs are more robust to perturbations than black-box solvers, but the accuracy-robustness tradeoff in MBAs remains underexplored. In this work, we propose a simple yet effective training scheme for MBAs, called SGD jittering, which injects noise iteration-wise during reconstruction. We theoretically demonstrate that SGD jittering not only generalizes better than the standard mean squared error training but is also more robust to average-case attacks. We validate SGD jittering using denoising toy examples, seismic deconvolution, and single-coil MRI reconstruction. Both SGD jittering and its SPGD extension yield cleaner reconstructions for out-of-distribution data and demonstrate enhanced robustness against adversarial attacks.
Problem

Research questions and friction points this paper is trying to address.

Improving robustness and accuracy in model-based inverse problems
Exploring accuracy-robustness tradeoff in model-based architectures
Enhancing robustness to adversarial attacks and generalization to out-of-distribution data
Innovation

Methods, ideas, or system contributions that make the work stand out.

SGD jittering injects noise iteration-wise
Improves generalization and robustness together
Validated on denoising toy examples, seismic deconvolution, and single-coil MRI reconstruction
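
The iteration-wise noise injection listed above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a linear forward operator A, a plain unrolled gradient-descent loop, and Gaussian noise added to the gradient at every iteration. The names unrolled_reconstruct, reg_grad, and jitter_std are hypothetical, and reg_grad stands in for whatever learned regularizer the MBA would use.

```python
import numpy as np

def unrolled_reconstruct(y, A, reg_grad, n_iters=10, step=0.1,
                         jitter_std=0.0, rng=None):
    """Unrolled gradient descent for y ~ A x with optional per-iteration jitter.

    reg_grad is a hypothetical stand-in for a learned regularizer gradient.
    jitter_std > 0 adds Gaussian noise to the gradient at every iteration
    (assumed form of the jittering); jitter_std = 0 gives the plain loop.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = A.T @ y  # simple initialization (an assumption of this sketch)
    for _ in range(n_iters):
        # Data-fidelity gradient plus the regularizer's contribution.
        grad = A.T @ (A @ x - y) + reg_grad(x)
        if jitter_std > 0:
            # Jitter: fresh noise is drawn at each iteration, not just once.
            grad = grad + jitter_std * rng.standard_normal(x.shape)
        x = x - step * grad
    return x

# Toy denoising setup: A is the identity, the regularizer is a fixed shrinkage.
y = np.array([1.0, -2.0, 0.5])
A = np.eye(3)
shrink = lambda x: 0.1 * x  # placeholder for a trained network

x_plain = unrolled_reconstruct(y, A, shrink, n_iters=200)
x_jitter = unrolled_reconstruct(y, A, shrink, n_iters=200, jitter_std=0.05,
                                rng=np.random.default_rng(0))
print(x_plain)
print(x_jitter)
```

In a training loop, the jittered output would presumably be compared with the ground truth under an MSE loss, and the jitter would be switched off (jitter_std=0) at inference. Both points are assumptions of this sketch rather than details stated in the summary.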