🤖 AI Summary
To address the challenge of robust image classification for intelligent devices (e.g., robots) operating under severe real-world degradations such as noise, blur, and occlusion, this paper proposes FROST, a novel framework for test-time robustness. FROST leverages high-frequency responses from the Fast Fourier Transform (FFT) to explicitly identify the degradation type and to dynamically select the most suitable normalization statistics (mean or variance) for each network layer, enabling layer-adaptive batch normalization. It requires no model retraining and is plug-and-play. On the severe-corruption subset of ImageNet-C, FROST reduces the mean Corruption Error (mCE) from 40.9% to 25.7%, a 37.1% relative improvement and a new state of the art. This work establishes an efficient, general-purpose paradigm for robust visual inference at test time.
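As a quick check of how the relative improvement follows from the two mCE figures quoted above, the arithmetic can be written out as follows. This assumes the usual definition of relative gain (error reduction divided by the baseline error), which the summary does not state explicitly:

$$
\text{relative gain} = \frac{\text{mCE}_{\text{baseline}} - \text{mCE}_{\text{FROST}}}{\text{mCE}_{\text{baseline}}} = \frac{40.9 - 25.7}{40.9} = \frac{15.2}{40.9} \approx 0.372
$$

The small gap to the reported 37.1% is presumably due to rounding of the published mCE values.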
📝 Abstract
Improving model robustness to corrupted images is among the key challenges in enabling robust vision systems on smart devices, such as robotic agents. In particular, robust test-time performance is imperative for most applications. This paper presents a novel approach to improving the robustness of any classification model, especially on severely corrupted images. Our method (FROST) employs high-frequency features to detect the corruption type of the input image and to select layer-wise feature normalization statistics. FROST provides state-of-the-art results for different models and datasets, outperforming competitors on ImageNet-C by up to 37.1% relative gain over a baseline of 40.9% mCE on severe corruptions.
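The general idea described above (measure high-frequency content, guess a corruption type, then pick normalization statistics per layer) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the cutoff, the thresholds, the three corruption labels, and the `LAYER_POLICY` table are all hypothetical choices made only for illustration.

```python
import numpy as np

def high_freq_ratio(image, cutoff=0.25):
    """Share of spectral energy above `cutoff` cycles/pixel (Nyquist is 0.5)."""
    h, w = image.shape
    # Remove the mean so the DC term does not dominate the spectrum.
    power = np.abs(np.fft.fft2(image - image.mean())) ** 2
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies, cycles/pixel
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies, cycles/pixel
    radius = np.sqrt(fx ** 2 + fy ** 2)
    total = power.sum()
    return float(power[radius > cutoff].sum() / total) if total > 0 else 0.0

def guess_corruption(ratio, low=0.05, high=0.5):
    """Hypothetical threshold rule (assumed here, not the paper's detector)."""
    if ratio > high:
        return "noise-like"   # unusually much high-frequency energy
    if ratio < low:
        return "blur-like"    # unusually little high-frequency energy
    return "clean-like"

# Hypothetical policy: which statistics each of three layers should use.
LAYER_POLICY = {
    "noise-like": ["batch", "batch", "train"],
    "blur-like": ["train", "batch", "batch"],
    "clean-like": ["train", "train", "train"],
}

def normalize_layer(features, source, train_mean, train_var, eps=1e-5):
    """Normalize (N, C) features with stored or current-batch statistics."""
    if source == "batch":
        mean, var = features.mean(axis=0), features.var(axis=0)
    else:
        mean, var = train_mean, train_var
    return (features - mean) / np.sqrt(var + eps)
```

In use, one would compute `high_freq_ratio` on a grayscale test image, map the result through `guess_corruption`, and look up `LAYER_POLICY` to decide, layer by layer, whether `normalize_layer` should use the training statistics or those of the current test batch. The thresholds here are arbitrary and would need to be calibrated on real data.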