Band Together: Untargeted Adversarial Training with Multimodal Coordination against Evasion-based Promotion Attacks

📅 2026-05-07
📈 Citations: 0
Influential: 0

🤖 AI Summary
Multimodal recommender systems are vulnerable to evasion-based promotion attacks, yet existing defenses are largely confined to single-modal settings and focus mainly on poisoning attacks, leaving evasion-based threats underexplored. This work proposes UAT-MC (Untargeted Adversarial Training with Multimodal Coordination). It identifies a cross-modal gradient mismatch in the multi-user promotion setting, where visual and textual perturbations are optimized in inconsistent directions, and introduces a gradient alignment mechanism that synchronizes the perturbations across the two modalities. Because the target items of an evasion attack are unknown in advance, training treats all items as potential targets. In the reported experiments, UAT-MC significantly improves robustness against promotion attacks while maintaining acceptable recommendation performance under the defense-accuracy trade-off.
📝 Abstract
Multimodal recommender systems exploit visual and textual signals to alleviate data sparsity, but this also makes them more vulnerable to evasion-based promotion attacks. Existing defenses are largely limited to single-modal settings and mainly focus on poisoning-based threats, leaving evasion-based threats underexplored. In this work, we first identify a cross-modal gradient mismatch under the multi-user promotion setting, where visual and textual perturbations are optimized in inconsistent directions due to the dominance of distinct user groups. This phenomenon dilutes the attack effectiveness and leads robust training to underestimate worst-case risks. To address this issue, we propose Untargeted Adversarial Training with Multimodal Coordination (UAT-MC). UAT-MC tackles the challenge of unknown targeted items in evasion-based attacks (as opposed to poisoning-based attacks) by treating all items as potential targets, and introduces a gradient alignment mechanism to explicitly correct this mismatch. This design ensures synchronized perturbations across modalities, thereby maximizing adversarial strength for robust training. Extensive experiments demonstrate that UAT-MC significantly improves robustness against promotion attacks while maintaining acceptable recommendation performance under the defense-accuracy trade-off. Code is available at https://github.com/gmXian/UAT-MC.
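The cross-modal mismatch described in the abstract can be made concrete with a toy example. The sketch below is not the UAT-MC implementation. It assumes a hypothetical linear scorer in which each user has one preference vector per modality, takes an FGSM-style sign step per modality for a single item, and then compares which users each perturbation actually helps. The scorer and all function names, the step size `eps`, and the toy user vectors are assumptions made for illustration. A negative cosine between the per-user gains means the two modalities are promoting the item to different user groups.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def promotion_gradient(user_vecs):
    # Assumed toy scorer: score(u, i) = <user_vec_u, item_feat_i>.
    # The gradient of the score summed over users, taken w.r.t. the item
    # features, is then just the sum of the user vectors.
    dim = len(user_vecs[0])
    return [sum(u[d] for u in user_vecs) for d in range(dim)]


def sign_step(grad, eps):
    # FGSM-style step: move eps along the sign of each gradient coordinate.
    return [eps * ((g > 0) - (g < 0)) for g in grad]


def per_user_gain(user_vecs, delta):
    # First-order change in each user's score caused by the perturbation.
    return [dot(u, delta) for u in user_vecs]


def cosine(x, y):
    nx, ny = math.sqrt(dot(x, x)), math.sqrt(dot(y, y))
    return dot(x, y) / (nx * ny) if nx and ny else 0.0


# Hypothetical data: users 0-1 dominate the visual gradient,
# users 2-3 dominate the textual gradient.
visual_users = [[2.0, 0.0], [2.0, 0.0], [-0.5, 0.0], [-0.5, 0.0]]
text_users = [[-0.5, 0.0], [-0.5, 0.0], [2.0, 0.0], [2.0, 0.0]]

eps = 0.1
delta_v = sign_step(promotion_gradient(visual_users), eps)
delta_t = sign_step(promotion_gradient(text_users), eps)

gain_v = per_user_gain(visual_users, delta_v)
gain_t = per_user_gain(text_users, delta_t)

# Compare the two modalities in user space, since their feature spaces differ.
mismatch = cosine(gain_v, gain_t)
print(f"cross-modal agreement over users: {mismatch:.3f}")
```

Comparing gains per user, rather than comparing the raw gradients, is a design choice of this sketch: visual and textual features live in different spaces, so their gradients cannot be compared directly. How UAT-MC itself measures and corrects the mismatch is specified in the paper and the linked repository, not here.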
Problem

Research questions and friction points this paper is trying to address.

multimodal recommender systems
evasion-based attacks
promotion attacks
adversarial robustness
cross-modal gradient mismatch
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal coordination
evasion-based attacks
adversarial training
gradient alignment
robust recommendation