🤖 AI Summary
This work investigates adversarially robust feature learning under high-dimensional multi-index models. Focusing on $\ell_2$-bounded perturbations with the squared loss, we first prove that the model's latent directions coincide precisely with the Bayes-optimal low-dimensional projection. Moreover, we reveal that representations learned by standard (non-robust) neural networks already provide a foundation for robustness. Under a statistical independence assumption, our theoretical analysis shows that the additional sample complexity of robust learning is independent of the input dimension. Crucially, robustness can be achieved via robust fine-tuning of a linear readout layer, matching the difficulty of standard learning without incurring additional sample costs that scale with dimension. These results substantially reduce the training overhead for robustness in high-dimensional settings and provide both a novel theoretical justification and a practical pathway toward scalable adversarial robustness.
📝 Abstract
Recently, there have been numerous studies on feature learning with neural networks, specifically on learning single- and multi-index models, where the target is a function of a low-dimensional projection of the input. Prior works have shown that in high dimensions, the majority of the compute and data resources are spent on recovering the low-dimensional projection; once this subspace is recovered, the remainder of the target can be learned independently of the ambient dimension. However, the implications of feature learning in adversarial settings remain unexplored. In this work, we take the first steps toward understanding adversarially robust feature learning with neural networks. Specifically, we prove that the hidden directions of a multi-index model offer a Bayes-optimal low-dimensional projection for robustness against $\ell_2$-bounded adversarial perturbations under the squared loss, assuming that the multi-index coordinates are statistically independent of the remaining coordinates. Therefore, robust learning can be achieved by first performing standard feature learning, then robustly tuning a linear readout layer on top of the standard representations. In particular, we show that adversarially robust learning is just as easy as standard learning: the number of additional samples needed to robustly learn multi-index models, compared to standard learning, does not depend on the dimensionality.
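The two-stage recipe described in the abstract (standard feature learning, then robust tuning of a linear readout) can be sketched numerically. The toy script below is an illustrative assumption, not the paper's actual procedure: it assumes stage one has already recovered the hidden directions $U$ of a synthetic multi-index model, and it exploits the closed form of the worst-case $\ell_2$ perturbation for a linear readout under squared loss, $\sup_{\|\delta\|_2 \le \epsilon} (w^\top(z+\delta) - y)^2 = (|w^\top z - y| + \epsilon \|w\|_2)^2$, to robustly fit the readout by gradient descent. All names and hyperparameters (`eps`, `lr`, the `tanh` link) are hypothetical choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic multi-index data: the target depends only on a k-dimensional
# projection U^T x of the d-dimensional input.
d, k, n = 50, 2, 500
U = np.linalg.qr(rng.standard_normal((d, k)))[0]   # orthonormal hidden directions
X = rng.standard_normal((n, d))
y = np.tanh(X @ U).sum(axis=1)                     # target = g(U^T x)

# Stage 1 (standard feature learning) is assumed already done: we use the
# recovered low-dimensional representation z = U^T x directly.
Z = X @ U

# Stage 2: robustly tune a linear readout w against l2-bounded perturbations
# of the representation.  For squared loss the adversary has a closed form:
#   sup_{||delta|| <= eps} (w.(z + delta) - y)^2 = (|w.z - y| + eps*||w||)^2,
# so we can descend on the exact robust loss (constant factor 2 absorbed in lr).
eps, lr = 0.1, 0.01
w = np.zeros(k)
for _ in range(2000):
    r = Z @ w - y
    norm_w = np.linalg.norm(w) + 1e-12
    robust_r = np.abs(r) + eps * norm_w
    # (Half-)gradient of the average robust squared loss w.r.t. w.
    grad = (robust_r * np.sign(r)) @ Z / n + (robust_r.mean() * eps) * w / norm_w
    w -= lr * grad

standard_mse = np.mean((Z @ w - y) ** 2)
print(f"standard MSE of robustly tuned readout: {standard_mse:.3f}")
```

Note that the robust problem is solved entirely in the k-dimensional representation space, so this stage never touches the ambient dimension d, mirroring the claim that the extra cost of robustness does not scale with dimensionality.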