🤖 AI Summary
This work addresses the challenge of simultaneously enhancing robustness against transfer-based and query-based adversarial attacks. We systematically investigate how key training hyperparameters -- learning rate, weight decay, momentum, and batch size -- affect this dual robustness. Our study reveals, for the first time, a pronounced antagonistic effect of the learning rate: higher values improve query-attack robustness but degrade transfer-attack robustness, and vice versa. Leveraging this insight, we propose a collaborative hyperparameter optimization strategy tailored for joint defense, validated across centralized training, model ensembling, and distributed architectures. Experiments across diverse scenarios demonstrate substantial improvements in dual robustness: up to a 64% gain in transfer-attack defense and up to 28% in query-attack defense. Notably, the distributed training architecture achieves the best trade-off between the two robustness objectives. This work fills a critical gap in hyperparameter-aware defense design against concurrent transfer and query attacks.
📝 Abstract
In this paper, we present the first detailed analysis of how optimization hyperparameters -- such as learning rate, weight decay, momentum, and batch size -- influence robustness against both transfer-based and query-based attacks. Supported by theory and experiments, our study spans a variety of practical deployment settings, including centralized training, ensemble learning, and distributed training. We uncover a striking dichotomy: for transfer-based attacks, decreasing the learning rate significantly enhances robustness by up to $64\%$. In contrast, for query-based attacks, increasing the learning rate consistently improves robustness by up to $28\%$ across various settings and data distributions. Leveraging these findings, we explore -- for the first time -- the optimization hyperparameter design space to jointly enhance robustness against both transfer-based and query-based attacks. Our results reveal that distributed models benefit the most from hyperparameter tuning, achieving a remarkable tradeoff by simultaneously mitigating both attack types more effectively than other training setups.
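The dichotomy described above implies a joint tuning problem: a learning rate that maximizes one robustness objective hurts the other. A minimal sketch of such a search is below; the two robustness curves are illustrative placeholders (not the paper's measurements or method), chosen only to encode the reported trends that lower learning rates favor transfer robustness and higher ones favor query robustness.

```python
import math

# Illustrative sketch (not the paper's method): sweep the learning rate,
# score robustness against both attack families, and keep the value that
# maximizes the worst case of the two antagonistic objectives.

def transfer_robustness(lr):
    # Placeholder curve encoding the reported trend: lower learning
    # rates yield better robustness to transfer-based attacks.
    return math.exp(-5.0 * lr)

def query_robustness(lr):
    # Placeholder curve encoding the opposite trend: higher learning
    # rates yield better robustness to query-based attacks.
    return 1.0 - math.exp(-2.0 * lr)

def joint_score(lr):
    # Worst-case (maximin) combination: a learning rate is only as
    # good as its weaker robustness objective.
    return min(transfer_robustness(lr), query_robustness(lr))

def best_learning_rate(grid):
    return max(grid, key=joint_score)

grid = [0.01, 0.05, 0.1, 0.2, 0.5, 1.0]
best = best_learning_rate(grid)  # with these placeholder curves, lr = 0.2
```

In practice each call to a robustness function would require adversarial evaluation of a trained model, so the grid is kept coarse; the maximin criterion is one simple way to formalize the tradeoff the abstract describes, and a weighted sum of the two objectives is an equally valid alternative.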