🤖 AI Summary
This work investigates the intrinsic relationship between neural network compressibility (encompassing neuron sparsity and spectral compression) and adversarial robustness, addressing the challenge of jointly optimizing both properties. Method: We propose a unified analytical framework that characterizes how compression distorts the representation space and amplifies sensitivity along a small number of directions, thereby systematically degrading L∞/L2 robustness; combining sensitivity analysis, spectral modeling, and sparsity control, we conduct adversarial training and transfer experiments on synthetic and real-world tasks. Contribution/Results: Our analysis derives a theoretical robustness bound that exposes a fundamental tension between structured compression and adversarial robustness, and our experiments confirm that compression-induced fragility persists across diverse robustification paradigms. These findings provide theoretical foundations and practical guidelines for designing models that are simultaneously efficient and secure.
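To make the claimed sensitivity mechanism concrete, the NumPy sketch below is our own illustration under simplified assumptions, not the paper's experimental code (all names and dimensions are hypothetical). It compares a linear layer with a flat singular spectrum to one with a rapidly decaying, spectrally compressed spectrum of equal Frobenius norm, and measures the output change caused by a fixed-norm input perturbation along the top singular direction versus a random direction.

```python
# Illustrative sketch (hypothetical setup, not the paper's code): how spectral
# compression concentrates sensitivity in a few input directions.
import numpy as np

rng = np.random.default_rng(0)


def layer_with_spectrum(singular_values, d):
    """Build a random d x d weight matrix with a prescribed singular spectrum."""
    U, _ = np.linalg.qr(rng.standard_normal((d, d)))
    V, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return U @ np.diag(singular_values) @ V.T


d, eps = 64, 0.1
flat = np.ones(d)                      # flat spectrum: energy spread evenly
compressed = 0.9 ** np.arange(d)       # decaying spectrum: spectrally compressed
compressed *= np.linalg.norm(flat) / np.linalg.norm(compressed)  # equal energy

for name, s in [("flat spectrum", flat), ("compressed spectrum", compressed)]:
    W = layer_with_spectrum(s, d)
    _, _, Vt = np.linalg.svd(W)
    worst = eps * W @ Vt[0]            # perturb along the top right singular vector
    rand = rng.standard_normal(d)
    rand *= eps / np.linalg.norm(rand) # same-norm perturbation, random direction
    print(f"{name}: worst-case change = {np.linalg.norm(worst):.3f}, "
          f"random-direction change = {np.linalg.norm(W @ rand):.3f}")
```

Both layers respond similarly to a random perturbation, but the spectrally compressed one amplifies the worst-case direction several-fold, matching the intuition that compression creates a small number of highly sensitive directions.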
📝 Abstract
Modern neural networks are expected to simultaneously satisfy a host of desirable properties: accurate fitting to training data, generalization to unseen inputs, parameter and computational efficiency, and robustness to adversarial perturbations. While compressibility and robustness have each been studied extensively, a unified understanding of their interaction remains elusive. In this work, we develop a principled framework to analyze how different forms of compressibility - such as neuron-level sparsity and spectral compressibility - affect adversarial robustness. We show that these forms of compression can induce a small number of highly sensitive directions in the representation space, which adversaries can exploit to construct effective perturbations. Our analysis yields a simple yet instructive robustness bound, revealing how neuron and spectral compressibility impact $L_\infty$ and $L_2$ robustness via their effects on the learned representations. Crucially, the vulnerabilities we identify arise irrespective of how compression is achieved - whether via regularization, architectural bias, or implicit learning dynamics. Through empirical evaluations on synthetic and realistic tasks, we confirm our theoretical predictions and further demonstrate that these vulnerabilities persist under adversarial training and transfer learning, and contribute to the emergence of universal adversarial perturbations. Our findings reveal a fundamental tension between structured compressibility and robustness, and suggest new pathways for designing models that are both efficient and secure.
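As a worked illustration of how a sensitive direction translates into $L_\infty$ and $L_2$ perturbations for a single linear map (a sketch under our own simplified assumptions, not the paper's derivation; the matrix and budget below are hypothetical): maximizing the response along the top left singular vector $u_1$ subject to $\|\delta\|_\infty \le \epsilon$ gives $\delta = \epsilon \cdot \mathrm{sign}(v_1)$ and an output change of $\epsilon \, \sigma_1 \|v_1\|_1$, while the $L_2$-constrained counterpart achieves $\epsilon \, \sigma_1$.

```python
# Hedged sketch (hypothetical dimensions, not the paper's method): worst-case
# L_inf perturbation of a linear layer along its most sensitive direction.
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((32, 128)) / np.sqrt(128)

U, s, Vt = np.linalg.svd(W, full_matrices=False)
u1, sigma1, v1 = U[:, 0], s[0], Vt[0]

eps = 0.05
# Maximizing u1^T W delta under ||delta||_inf <= eps is solved by the sign of
# W^T u1 = sigma1 * v1, giving an output change of eps * sigma1 * ||v1||_1.
delta = eps * np.sign(v1)
print("achieved L_inf change:      ", u1 @ W @ delta)
print("closed form eps*s1*||v1||_1:", eps * sigma1 * np.linalg.norm(v1, 1))
# For comparison, the L2-constrained worst case (delta = eps * v1) is eps * sigma1.
print("L2 worst case eps*sigma1:   ", eps * sigma1)
```

Because the worst-case change is governed by $\sigma_1$ and the geometry of $v_1$, spectra and sparsity patterns that concentrate this quantity directly shape the achievable perturbation strength, which is the kind of dependence the robustness bound above captures.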