🤖 AI Summary
The adversarial robustness of zero-shot learning (ZSL) models remains poorly understood under both class-level and concept-level perturbations, particularly in generalized ZSL (GZSL), where models are vulnerable to conventional class attacks as well as semantic-level manipulations. Method: We systematically evaluate the adversarial vulnerability of three mainstream ZSL architectures, introducing (i) calibration-point-dependent class attacks, whose success we show depends critically on classifier calibration, and (ii) the class-bias-enhanced attack (CBEA), which induces complete robustness collapse across all calibration points, driving GZSL accuracy to zero. We further propose two novel concept-level attacks, the class-preserving concept attack (CPconA) and the non-class-preserving concept attack (NCPconA), which enable precise erasure or injection of semantic concepts to steer predictions. Contribution/Results: Our framework constitutes the first multi-granular adversarial evaluation suite jointly covering class- and concept-level perturbations, with the option to preserve or disrupt semantic consistency, thereby exposing fundamental structural vulnerabilities in ZSL models.
📝 Abstract
Zero-shot Learning (ZSL) aims to enable image classifiers to recognize images from unseen classes that were not included during training. Unlike traditional supervised classification, ZSL typically relies on learning a mapping from visual features to predefined, human-understandable class concepts. While ZSL models promise to improve generalization and interpretability, their robustness under systematic input perturbations remains unclear. In this study, we present an empirical analysis of the robustness of existing ZSL methods at both the class level and the concept level. Specifically, we successfully disrupt their class predictions with the well-known non-targeted class attack (clsA). However, in the Generalized Zero-shot Learning (GZSL) setting, we observe that clsA succeeds only at the original best calibration point. After the attack, the optimal calibration point shifts, and ZSL models maintain relatively strong performance at other calibration points, indicating that clsA yields only a spurious attack success in GZSL. To address this, we propose the Class-Bias Enhanced Attack (CBEA), which eliminates GZSL accuracy across all calibration points by enlarging the gap between seen- and unseen-class probabilities. Next, at the concept level, we introduce two novel attack modes: the Class-Preserving Concept Attack (CPconA) and the Non-Class-Preserving Concept Attack (NCPconA). Our extensive experiments evaluate three typical ZSL models spanning architectures from the past three years and reveal that ZSL models are vulnerable not only to the traditional class attack but also to concept-based attacks, which allow malicious actors to easily manipulate class predictions by erasing or introducing concepts. Our findings highlight a significant performance gap between existing approaches, emphasizing the need for improved adversarial robustness in current ZSL models.
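The abstract does not give the CBEA formulation, but the stated idea (enlarging the gap between total seen-class and unseen-class probability mass so that no calibration threshold can recover the unseen prediction) can be sketched as a PGD-style attack on a toy compatibility model. Everything here is illustrative: the linear visual-to-semantic map, the attribute matrix, and the `bias_gap` objective are assumptions standing in for the paper's actual models and loss.

```python
import numpy as np

# Toy GZSL setup (illustrative only): a linear map W projects an image
# embedding into a semantic space, where it is scored against class
# attribute vectors A for both seen and unseen classes.
rng = np.random.default_rng(0)
num_seen, num_unseen, dim = 5, 3, 16
A = rng.normal(size=(num_seen + num_unseen, dim))   # class attribute vectors
W = rng.normal(size=(dim, dim))                     # learned visual->semantic map
s = np.concatenate([np.ones(num_seen), -np.ones(num_unseen)])  # +1 seen, -1 unseen

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def bias_gap(x):
    # CBEA-style objective as we read the abstract: seen-class probability
    # mass minus unseen-class mass; in [-1, 1], larger = more seen-biased.
    p = softmax((x @ W) @ A.T)
    return float(s @ p)

def grad_bias_gap(x):
    # Analytic gradient: for p = softmax(z) and f = s.p,
    # df/dz_k = p_k * (s_k - f); then chain through z = (x W) A^T.
    p = softmax((x @ W) @ A.T)
    f = s @ p
    g_logits = p * (s - f)
    return (W @ A.T) @ g_logits

def pgd_attack(x, eps=0.5, alpha=0.1, steps=30):
    # L_inf-bounded PGD that *ascends* the seen/unseen bias gap, so an
    # unseen-class input gets misclassified at every calibration point.
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_bias_gap(x_adv))
        x_adv = x + np.clip(x_adv - x, -eps, eps)   # project back to the ball
    return x_adv

x = rng.normal(size=dim)            # stand-in for an unseen-class embedding
gap_before = bias_gap(x)
gap_after = bias_gap(pgd_attack(x))
```

Because the objective is the seen/unseen probability gap itself rather than any single class's loss, no post-hoc shift of the calibration point can undo the attack, which is the behavior the abstract attributes to CBEA.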