🤖 AI Summary
This work investigates the adversarial robustness of implicit neural representation (INR) parameter-space classifiers, a previously unexplored direction. In contrast to conventional pixel-space models, the authors establish, both empirically and theoretically, that INR-based classifiers inherently exhibit strong adversarial robustness without requiring dedicated robust training. To enable rigorous evaluation, they introduce the first specialized adversarial attack suite and a systematic attack-defense analysis framework tailored to parameter-space classifiers. Extensive experiments demonstrate that such classifiers achieve 30-50% higher robust accuracy than pixel-space counterparts under both standard and custom-designed attacks, while incurring significantly lower computational overhead at inference. The core contribution lies in uncovering and validating the intrinsic robustness of INR parameter-space classification, thereby establishing the parameter space as a theoretically grounded and empirically validated paradigm for building more secure machine-learning models.
📝 Abstract
Implicit Neural Representations (INRs) have recently garnered increasing interest across various research fields, mainly due to their ability to represent large, complex data in a compact and continuous manner. Prior work has further shown that numerous popular downstream tasks can be performed directly in the INR parameter space. Doing so can substantially reduce the computational resources required to process the represented data in their native domain. A major difficulty in using modern machine-learning approaches is their high susceptibility to adversarial attacks, which have been shown to greatly limit the reliability and applicability of such methods in a wide range of settings. In this work, we show that parameter-space models trained for classification are inherently robust to adversarial attacks, without the need for any robust training. To support our claims, we develop a novel suite of adversarial attacks targeting parameter-space classifiers, and further analyze practical considerations of attacking such classifiers. Code for reproducing all experiments and implementations of all proposed methods will be released upon publication.
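To make the parameter-space setting concrete, the following is a minimal NumPy sketch, with hypothetical layer sizes and a SIREN-style sine activation, of how a signal's INR weights become the input to a classifier. It is an illustration of the general setup, not the paper's actual architecture; note that a pixel-space attacker must re-fit the INR for a perturbation to propagate into the parameter vector the classifier actually sees.

```python
import numpy as np

# Hypothetical sketch: a tiny SIREN-style INR and a linear
# parameter-space classifier over its flattened weights.
# All names and sizes here are illustrative assumptions.

rng = np.random.default_rng(0)

# INR: continuous coords (x, y) -> grayscale value,
# one hidden layer with a sinusoidal activation.
W1 = rng.normal(size=(2, 16))
b1 = np.zeros(16)
W2 = rng.normal(size=(16, 1))
b2 = np.zeros(1)

def inr_forward(coords):
    """Evaluate the INR at continuous coordinates in [-1, 1]^2."""
    h = np.sin(coords @ W1 + b1)   # sinusoidal hidden layer
    return h @ W2 + b2             # linear output head

def flatten_params():
    """Concatenate all INR weights into one vector -- this vector,
    not the pixels, is the input to a parameter-space classifier."""
    return np.concatenate([W1.ravel(), b1, W2.ravel(), b2])

theta = flatten_params()
print(theta.size)                  # 2*16 + 16 + 16*1 + 1 = 65

# Toy linear parameter-space classifier (10 classes).
Wc = rng.normal(size=(theta.size, 10))
logits = theta @ Wc
print(logits.shape)                # (10,)
```

The key design point the sketch exposes is the indirection: the classifier's input domain is the weight space of the fitted INR, so a standard pixel-space attack has no direct gradient path to the classifier input.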