🤖 AI Summary
This work addresses the limited robustness of existing few-shot segmentation methods under real-world complexities such as illumination variations, motion blur, and object camouflage. To this end, we introduce a novel environment-robust few-shot segmentation setting and establish the ER-FSS benchmark, which spans eight datasets covering diverse real-world scenarios. We further propose an Adaptive Attention Distillation (AAD) mechanism that dynamically extracts shared semantics between support and query images through semantic contrastive distillation, generating class-aware attention maps for novel categories to enhance segmentation performance. Extensive experiments demonstrate that our approach consistently improves mean Intersection-over-Union (mIoU) by 3.3%–8.5% across all datasets and settings, significantly boosting model generalization and robustness in dynamic real-world environments.
📝 Abstract
Few-shot segmentation (FSS) aims to rapidly learn novel class concepts from limited examples in order to segment specific targets in unseen images, and has been widely applied in areas such as medical diagnosis and industrial inspection. However, existing studies largely overlook the complex environmental factors encountered in real-world scenarios, such as illumination, background, and camera viewpoint, which can substantially increase the difficulty of test images. As a result, models trained under laboratory conditions often fall short of practical deployment requirements. To bridge this gap, this paper introduces an environment-robust FSS setting that explicitly incorporates challenging test cases arising from complex environments, such as motion blur, small objects, and camouflaged targets, to enhance the model's robustness under realistic, dynamic conditions. An environment-robust FSS benchmark (ER-FSS) is established, covering eight datasets across multiple real-world scenarios. In addition, an Adaptive Attention Distillation (AAD) method is proposed, which repeatedly contrasts and distills key shared semantics between known (support) and unknown (query) images to derive class-specific attention for novel categories. This strengthens the model's ability to focus on the correct targets in complex environments, thereby improving environmental robustness. Comparative experiments show that AAD improves mIoU by 3.3%–8.5% across all datasets and settings, demonstrating superior performance and strong generalization. The source code and dataset are available at: https://github.com/guoqianyu-alberta/Adaptive-Attention-Distillation-for-FSS.
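To make the support-query attention idea concrete, the sketch below shows one way such a mechanism could look in PyTorch: a class prototype is pooled from the masked support features, a cosine-similarity attention map is computed over the query features, and a contrastive term pulls high-attention query pixels toward the prototype. The function names (`masked_average_pool`, `class_aware_attention`, `contrastive_distillation_loss`), the temperature `tau`, and the specific loss form are illustrative assumptions under the general prototype-attention technique, not the paper's actual AAD implementation.

```python
import torch
import torch.nn.functional as F

def masked_average_pool(feats, mask):
    """Pool support features over the annotated foreground region.

    feats: (B, C, H, W) support feature maps
    mask:  (B, 1, H, W) binary foreground mask (float)
    Returns a (B, C) class prototype.
    """
    mask = F.interpolate(mask, size=feats.shape[-2:], mode="nearest")
    pooled = (feats * mask).sum(dim=(2, 3)) / mask.sum(dim=(2, 3)).clamp(min=1e-6)
    return pooled

def class_aware_attention(query_feats, prototype):
    """Cosine-similarity attention between query features and the prototype.

    query_feats: (B, C, H, W); prototype: (B, C)
    Returns a (B, 1, H, W) attention map in [0, 1].
    """
    q = F.normalize(query_feats, dim=1)
    p = F.normalize(prototype, dim=1)[..., None, None]   # (B, C, 1, 1)
    attn = (q * p).sum(dim=1, keepdim=True)              # cosine similarity per pixel
    return attn.clamp(min=0)                             # keep positive affinity only

def contrastive_distillation_loss(query_feats, prototype, attn, tau=0.1):
    """Illustrative contrastive term: pull high-attention query pixels toward
    the support prototype and push low-attention (background) pixels away.
    """
    q = F.normalize(query_feats.flatten(2).transpose(1, 2), dim=-1)  # (B, HW, C)
    p = F.normalize(prototype, dim=-1).unsqueeze(2)                  # (B, C, 1)
    sim = (q @ p).squeeze(-1) / tau                                  # (B, HW) logits
    w = attn.flatten(1)                                              # (B, HW) weights
    # Attention-thresholded pseudo-labels: foreground pixels should score high.
    return F.binary_cross_entropy_with_logits(sim, (w > 0.5).float())
```

In this sketch the attention map doubles as a soft pseudo-label for the contrastive term, which is one plausible reading of "contrasting and distilling shared semantics"; the paper's method may combine these signals differently.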