🤖 AI Summary
Neural ranking models are vulnerable to adversarial perturbations at the character, word, and phrase levels, enabling malicious manipulation of retrieval results and undermining the trustworthiness of search engines and RAG systems. To address this, we propose the first randomized-masking smoothing framework tailored to ranking tasks, providing the first provable top-K robustness certification. Our method integrates the contextual modeling of pretrained language models, the pairwise structure of ranking, and probabilistic statistical analysis, enabling multi-granularity perturbation modeling without strong assumptions or heuristic rules. Experiments demonstrate that over 20% of documents in the top-10 ranked results carry theoretically guaranteed robustness against perturbations affecting up to 30% of their content, significantly enhancing the security and reliability of real-world retrieval systems.
📝 Abstract
Neural ranking models have achieved remarkable progress and are now widely deployed in real-world applications such as Retrieval-Augmented Generation (RAG). However, like other neural architectures, they remain vulnerable to adversarial manipulation: subtle character-, word-, or phrase-level perturbations can poison retrieval results and artificially promote targeted candidates, undermining the integrity of search engines and downstream systems. Existing defenses either rely on heuristics with poor generalization or on certified methods that assume unrealistically strong knowledge of the adversary, limiting their practical use. To address these challenges, we propose RobustMask, a novel defense that combines the context-prediction capability of pretrained language models with a randomized masking-based smoothing mechanism. Our approach strengthens neural ranking models against adversarial perturbations at the character, word, and phrase levels. Leveraging both the pairwise comparison ability of ranking models and probabilistic statistical analysis, we provide a theoretical proof of RobustMask's certified top-K robustness. Extensive experiments further demonstrate that RobustMask successfully certifies over 20% of candidate documents within the top-10 ranking positions against adversarial perturbations affecting up to 30% of their content. These results highlight the effectiveness of RobustMask in enhancing the adversarial robustness of neural ranking models, marking a significant step toward stronger security guarantees for real-world retrieval systems.
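To make the randomized masking-based smoothing idea concrete, the sketch below shows the general pattern such defenses follow: repeatedly mask random spans of two candidate documents and take a majority vote over the base model's pairwise preferences, so that a bounded perturbation can only flip a bounded fraction of votes. This is a minimal illustration assuming a generic `score_fn`; the function names, the word-level masking, and the toy scoring are all hypothetical, and it omits the paper's actual components (PLM-based context prediction for masked tokens and the probabilistic analysis that turns vote counts into a certified top-K guarantee).

```python
import random

MASK = "[MASK]"

def random_mask(tokens, mask_rate, rng):
    """Replace a random subset of tokens with the mask symbol."""
    n_mask = int(len(tokens) * mask_rate)
    masked_idx = set(rng.sample(range(len(tokens)), n_mask))
    return [MASK if i in masked_idx else t for i, t in enumerate(tokens)]

def smoothed_prefers(score_fn, query, doc_a, doc_b,
                     mask_rate=0.3, n_samples=200, seed=0):
    """Majority vote over randomly masked copies: does the smoothed
    ranker prefer doc_a over doc_b for this query?  Returns the vote
    outcome and the empirical win fraction (which a certification
    procedure would bound statistically)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_samples):
        a = " ".join(random_mask(doc_a.split(), mask_rate, rng))
        b = " ".join(random_mask(doc_b.split(), mask_rate, rng))
        if score_fn(query, a) > score_fn(query, b):
            wins += 1
    return wins > n_samples // 2, wins / n_samples

# Toy relevance score: count query-term matches (stands in for a
# neural pairwise ranking model).
toy_score = lambda q, d: sum(w in q.split() for w in d.split())

prefers, frac = smoothed_prefers(
    toy_score, "cat", "cat cat cat cat", "dog dog dog dog")
```

In the full method, the stability of `frac` under masking is what admits a certificate: if the win fraction stays above one half even when an adversary controls up to 30% of the tokens, the pairwise ordering (and hence the document's top-K membership) is provably preserved.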