Backdooring Self-Supervised Contrastive Learning by Noisy Alignment

📅 2025-08-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Self-supervised contrastive learning (CL) is vulnerable to data-poisoning backdoor attacks (DPCLs); however, existing DPCL methods achieve limited attack efficacy because they rely on fragile implicit co-occurrence patterns and insufficiently suppress discriminative features in backdoored images. To address this, the authors propose Noisy Alignment (NA), a DPCL method that explicitly models and suppresses the noise components inherent in backdoored images. NA optimizes the spatial layout of poisoned samples by strategically exploiting contrastive learning's randomized cropping, enabling representation-level manipulation of the CL encoder, and the optimal alignment parameters are derived theoretically. Extensive experiments show that NA outperforms state-of-the-art DPCL methods across multiple benchmarks and downstream tasks, maintaining high attack success rates under common backdoor defenses while preserving clean-sample accuracy.

📝 Abstract
Self-supervised contrastive learning (CL) effectively learns transferable representations from unlabeled data containing images or image-text pairs but suffers vulnerability to data poisoning backdoor attacks (DPCLs). An adversary can inject poisoned images into pretraining datasets, causing compromised CL encoders to exhibit targeted misbehavior in downstream tasks. Existing DPCLs, however, achieve limited efficacy due to their dependence on fragile implicit co-occurrence between backdoor and target object and inadequate suppression of discriminative features in backdoored images. We propose Noisy Alignment (NA), a DPCL method that explicitly suppresses noise components in poisoned images. Inspired by powerful training-controllable CL attacks, we identify and extract the critical objective of noisy alignment, adapting it effectively into data-poisoning scenarios. Our method implements noisy alignment by strategically manipulating contrastive learning's random cropping mechanism, formulating this process as an image layout optimization problem with theoretically derived optimal parameters. The resulting method is simple yet effective, achieving state-of-the-art performance compared to existing DPCLs, while maintaining clean-data accuracy. Furthermore, Noisy Alignment demonstrates robustness against common backdoor defenses. Codes can be found at https://github.com/jsrdcht/Noisy-Alignment.
Problem

Research questions and friction points this paper is trying to address.

Self-supervised contrastive learning is vulnerable to data-poisoning backdoor attacks injected into its unlabeled pretraining data
Existing DPCLs depend on fragile implicit co-occurrence between the backdoor trigger and the target object
Existing DPCLs inadequately suppress discriminative features in backdoored images, limiting attack efficacy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Noisy Alignment explicitly suppresses noise components in poisoned images, adapting an objective from training-controllable CL attacks to the data-poisoning setting
Strategically manipulates contrastive learning's random cropping mechanism
Formulates poisoned-image layout as an optimization problem with theoretically derived optimal parameters
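The bullets above can be made concrete with a toy sketch. This is not the paper's actual algorithm: it only illustrates, by Monte Carlo simulation, why the spatial layout of a trigger patch inside a poisoned image matters under contrastive learning's random-crop augmentation. We estimate the probability that a SimCLR-style random crop fully contains the trigger for two candidate placements; all sizes and the crop-scale range are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

IMG = 224          # poisoned-image side length (assumed)
TRIG = 32          # trigger patch side length (assumed)
SCALE = (0.2, 1.0) # RandomResizedCrop-style area-scale range (assumed)

def crop_contains_trigger(trig_xy, n_trials=20_000):
    """Estimate P(a random square crop fully covers the trigger patch)."""
    tx, ty = trig_xy
    hits = 0
    for _ in range(n_trials):
        # Sample a crop area uniformly in the scale range, as in
        # torchvision-style RandomResizedCrop (square crop for simplicity).
        area = rng.uniform(*SCALE) * IMG * IMG
        side = int(np.sqrt(area))
        x0 = rng.integers(0, IMG - side + 1)
        y0 = rng.integers(0, IMG - side + 1)
        # The crop covers the trigger iff it spans the whole trigger box.
        if (x0 <= tx and y0 <= ty
                and x0 + side >= tx + TRIG and y0 + side >= ty + TRIG):
            hits += 1
    return hits / n_trials

center = ((IMG - TRIG) // 2, (IMG - TRIG) // 2)  # trigger centered
corner = (0, 0)                                  # trigger in top-left corner

p_center = crop_contains_trigger(center)
p_corner = crop_contains_trigger(corner)
print(f"P(crop covers trigger | centered) ~ {p_center:.3f}")
print(f"P(crop covers trigger | corner)   ~ {p_corner:.3f}")
```

Under these assumptions, a centered trigger is covered by the random crop far more often than a corner-placed one, which conveys the intuition behind treating poisoned-image layout as an optimization problem over crop statistics.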