AI Summary
This work exposes an inherent vulnerability in the encoder of the Segment Anything Model (SAM) and the resulting single-point-of-failure risk for downstream vision tasks. To address this, we propose the first transferable adversarial attack method based on parametric simplicial complexes: we model the vulnerable regions shared by SAM and downstream models as a simplicial complex with common vertices, and perform cross-domain attacks via iterative vertex optimization coupled with lightweight domain re-adaptation. Our approach integrates adversarial simplicial complex generation, vertex refinement, stochastic sampling, and few-shot domain adaptation. Evaluated on five domain-specific datasets, our method achieves an average attack success rate 12.7% higher than state-of-the-art methods and consistently triggers failures across multiple downstream models, underscoring how strongly the robustness of foundation vision models affects downstream reliability.
Abstract
While the Segment Anything Model (SAM) has transformed interactive segmentation with its zero-shot capabilities, its inherent vulnerabilities make it a single point of failure that could break numerous downstream applications. Proactively evaluating these transferable vulnerabilities is therefore imperative. Prior adversarial attacks on SAM often transfer poorly because they do not sufficiently explore weaknesses common across domains. To address this, we propose the Vertex-Refining Simplicial Complex Attack (VeSCA), a novel method that uses only SAM's encoder to generate transferable adversarial examples. Specifically, VeSCA explicitly characterizes the vulnerable regions shared by SAM and downstream models through a parametric simplicial complex, and it locates such complexes within adversarially potent regions by iterative vertex-wise refinement. A lightweight domain re-adaptation strategy, applied during initialization of the simplicial complex, bridges domain divergence using minimal reference data. Finally, VeSCA generates consistently transferable adversarial examples by randomly sampling from the simplicial complex. Extensive experiments show that VeSCA improves performance by 12.7% over state-of-the-art methods across three downstream model categories and five domain-specific datasets. Our findings further highlight the risks that SAM's vulnerabilities pose to downstream models and underscore the urgency of developing more robust foundation models.
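The abstract does not give the algorithmic details, so the following is only a minimal sketch of the general idea of sampling perturbations from a simplex and refining its vertices, not VeSCA itself. Everything in it is an assumption made for illustration: the toy linear encoder `W` stands in for SAM's encoder, the feature-distortion loss, the L-infinity budget `eps`, the Dirichlet sampling, the sign-gradient step, and the hypothetical function names `sample_simplex_point` and `refine_vertices`. The domain re-adaptation step is omitted.

```python
import numpy as np


def sample_simplex_point(vertices, rng):
    """Draw a random convex combination of the vertex perturbations."""
    k = vertices.shape[0]
    weights = rng.dirichlet(np.ones(k))  # non-negative weights that sum to 1
    return weights @ vertices, weights


def refine_vertices(W, vertices, eps, step, n_iters, rng):
    """Toy vertex refinement against a linear encoder f(x) = W @ x.

    Assumed loss: ||f(x + delta) - f(x)||^2 = ||W @ delta||^2, so the
    gradient with respect to delta is 2 * W.T @ (W @ delta) and does not
    depend on the clean input x.
    """
    vertices = vertices.copy()
    for _ in range(n_iters):
        delta, weights = sample_simplex_point(vertices, rng)
        grad_delta = 2.0 * W.T @ (W @ delta)
        for i in range(vertices.shape[0]):
            # Chain rule: d(loss)/d(vertex_i) = weights[i] * grad_delta.
            update = step * np.sign(weights[i] * grad_delta)
            # Keep every vertex inside the L-infinity ball of radius eps.
            vertices[i] = np.clip(vertices[i] + update, -eps, eps)
    return vertices


rng = np.random.default_rng(0)
dim, feat_dim, n_vertices = 16, 8, 3
eps = 8.0 / 255.0

W = rng.standard_normal((feat_dim, dim))  # stand-in encoder (assumption)
init_vertices = 0.1 * rng.uniform(-eps, eps, size=(n_vertices, dim))

refined = refine_vertices(W, init_vertices, eps, step=eps / 4, n_iters=20, rng=rng)

# Attack time: any point sampled from the refined simplex is a candidate
# perturbation to add to an input of the same dimension.
delta, weights = sample_simplex_point(refined, rng)
```

Because the L-infinity ball is convex, any convex combination of in-budget vertices also stays within the budget, so sampled perturbations need no extra projection in this sketch.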