🤖 AI Summary
Existing RAG defense methods struggle to balance computational efficiency with robustness against strong poisoning attacks, and they often overlook the contextual information embedded in retrieval ranking structures. This work identifies, for the first time, an anomalously strong alignment between the forward rank (semantic relevance to the query) and the backward rank (contextual consistency with other retrieved documents) of poisoned documents. Leveraging this insight, the authors propose a lightweight defense framework that integrates bidirectional ranking analysis with a dual-signal fusion strategy. Extensive experiments across three datasets, three retrievers, and three large language models show that the method reduces the success rate of PoisonedRAG attacks by up to 54%, improves task accuracy by up to 56%, and adds less than one second of latency on average.
📝 Abstract
The growing adoption of Retrieval-Augmented Generation (RAG) has led to a rise in adversarial attacks. Existing defenses, which rely on semantic analysis or voting, face a trade-off between high computational cost and limited robustness under strong poisoning attacks. Their fundamental limitation is an exclusive focus on semantic content relevance, which neglects the retrieval context defined by ranking structures. To address this, we investigate the bidirectional ranking behavior of poisoned and benign documents and discover a key discriminative pattern: poisoned documents exhibit significantly stronger alignment between their backward rankings and the query's forward ranking. Building on this observation, we propose BiRD, a bidirectional ranking defense mechanism built on a dual-signal framework that uses forward ranking to assess semantic content relevance and backward ranking to quantify ranking context consistency. This design directly addresses the fundamental limitation of prior approaches, enabling efficiency and robustness at the same time. Extensive evaluation across 3 datasets with 3 retrievers and 3 LLMs under 2 attack scenarios validates BiRD's effectiveness. Notably, BiRD reduces the attack success rate of PoisonedRAG by up to 54% while improving task accuracy by up to 56%, with average additional latency under 1 second.
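The alignment pattern described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not define how backward rankings are computed or how the two signals are fused. The sketch assumes that a document's backward ranking is the ordering of the other retrieved documents by cosine similarity to that document, and that alignment is measured with Spearman rank correlation against the query's forward ranking of those same documents. The function names (`cosine`, `to_ranks`, `spearman`, `alignment_scores`) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def to_ranks(scores):
    """Rank positions (0 = highest score); ties are broken by input order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, index in enumerate(order):
        ranks[index] = position
    return ranks


def spearman(r1, r2):
    """Pearson correlation of two rank lists (Spearman correlation)."""
    n = len(r1)
    if n < 2:
        return 0.0
    m1, m2 = sum(r1) / n, sum(r2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(r1, r2))
    var1 = sum((a - m1) ** 2 for a in r1)
    var2 = sum((b - m2) ** 2 for b in r2)
    if var1 == 0 or var2 == 0:
        return 0.0
    return cov / math.sqrt(var1 * var2)


def alignment_scores(query_vec, doc_vecs):
    """For each retrieved document, correlate its assumed backward ranking
    of the other documents with the query's forward ranking of them."""
    k = len(doc_vecs)
    forward_sim = [cosine(query_vec, d) for d in doc_vecs]
    scores = []
    for i in range(k):
        others = [j for j in range(k) if j != i]
        # Forward ranking: how the query orders the other documents.
        forward = to_ranks([forward_sim[j] for j in others])
        # Backward ranking (assumption): how document i orders them.
        backward = to_ranks([cosine(doc_vecs[i], doc_vecs[j]) for j in others])
        scores.append(spearman(forward, backward))
    return scores


if __name__ == "__main__":
    # Toy 2-D embeddings at 0, 10, 20, and 30 degrees from the query.
    query = [1.0, 0.0]
    docs = [[math.cos(math.radians(a)), math.sin(math.radians(a))]
            for a in (0.0, 10.0, 20.0, 30.0)]
    print(alignment_scores(query, docs))
```

In this toy setup, the document that points in the same direction as the query orders the remaining documents exactly as the query does, so it receives the highest alignment score, while the document farthest from the query orders them in reverse. How such a score would be thresholded or combined with the forward relevance signal is left open here, since the abstract does not specify it.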