GASLITEing the Retrieval: Exploring Vulnerabilities in Dense Embedding-based Search

📅 2024-12-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work exposes a critical vulnerability of deep-learning-based dense embedding retrieval systems to SEO-style attacks: adversaries can inject a small amount of adversarial text to hijack retrieval rankings for targeted queries (e.g., those about a public figure), causing malicious content to surface at the top of search results. To study this threat, the authors propose GASLITE, a mathematically principled, gradient-based method for generating adversarial passages that requires no knowledge of the corpus content and no modification of the retrieval model. The generated passages carry adversary-chosen information while achieving high retrieval ranking for the targeted query distribution. Evaluated across nine advanced retrieval models under varied threat models, GASLITE outperforms baselines by ≥140% in attack success rate. Crucially, injecting adversarial text amounting to ≤0.0001% of the corpus suffices to place malicious content within the top-10 results for 61–100% of unseen concept-specific queries against most evaluated models.
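To build intuition for why such an attack is possible, consider how dense retrieval works: passages are ranked by embedding similarity to the query, so a single injected passage whose embedding sits near the centroid of a concept's query cluster can dominate the top-10 for most queries in that cluster. The following is a minimal toy sketch with random vectors standing in for real encoder embeddings; it is an illustration of the ranking mechanism, not the paper's method.

```python
import numpy as np

def top_k(query_emb, passage_embs, k=10):
    # Dense retrieval: rank passages by cosine similarity to the query.
    q = query_emb / np.linalg.norm(query_emb)
    p = passage_embs / np.linalg.norm(passage_embs, axis=1, keepdims=True)
    scores = p @ q
    return np.argsort(-scores)[:k]

rng = np.random.default_rng(0)
corpus = rng.normal(size=(10_000, 64))      # benign passage embeddings (toy)
queries = rng.normal(size=(50, 64)) + 3.0   # a concept-specific query cluster
centroid = queries.mean(axis=0)

# An adversarial passage whose embedding lands near the query centroid
# hijacks the top-10 for most queries in the cluster.
corpus_with_adv = np.vstack([corpus, centroid])
adv_idx = len(corpus_with_adv) - 1

hits = sum(adv_idx in top_k(q, corpus_with_adv) for q in queries)
```

Here `hits` counts how many of the 50 concept queries retrieve the injected passage in their top-10; with a tight query cluster, nearly all of them do, mirroring the paper's finding that one well-placed passage generalizes to unseen queries on the same concept.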

📝 Abstract
Dense embedding-based text retrieval – retrieval of relevant passages from corpora via deep learning encodings – has emerged as a powerful method attaining state-of-the-art search results and popularizing the use of Retrieval Augmented Generation (RAG). Still, like other search methods, embedding-based retrieval may be susceptible to search-engine optimization (SEO) attacks, where adversaries promote malicious content by introducing adversarial passages to corpora. To faithfully assess and gain insights into the susceptibility of such systems to SEO, this work proposes the GASLITE attack, a mathematically principled gradient-based search method for generating adversarial passages without relying on the corpus content or modifying the model. Notably, GASLITE's passages (1) carry adversary-chosen information while (2) achieving high retrieval ranking for a selected query distribution when inserted to corpora. We use GASLITE to extensively evaluate retrievers' robustness, testing nine advanced models under varied threat models, while focusing on realistic adversaries targeting queries on a specific concept (e.g., a public figure). We found GASLITE consistently outperformed baselines by ≥140% success rate, in all settings. Particularly, adversaries using GASLITE require minimal effort to manipulate search results – by injecting a negligible amount of adversarial passages (≤0.0001% of the corpus), they could make them visible in the top-10 results for 61-100% of unseen concept-specific queries against most evaluated models. Inspecting variance in retrievers' robustness, we identify key factors that may contribute to models' susceptibility to SEO, including specific properties in the embedding space's geometry.
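The abstract describes GASLITE as a gradient-based search over the passage's tokens. A common pattern for such attacks is greedy coordinate ascent: sweep over token positions, and at each position swap in the candidate token that most increases similarity to the target query embedding(s). The sketch below illustrates that search loop on a deliberately tiny toy "encoder" (mean of token embeddings); GASLITE itself operates on real transformer retrievers and uses gradients to shortlist candidate swaps, whereas this toy version simply evaluates every candidate. All names and sizes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
V, D, L = 500, 32, 8                 # toy vocab size, embedding dim, passage length
E = rng.normal(size=(V, D))          # toy token-embedding table
centroid = rng.normal(size=D)        # stand-in for the target queries' centroid
centroid /= np.linalg.norm(centroid)

def embed(tokens):
    # Toy encoder: mean of token embeddings (real retrievers use transformers).
    return E[tokens].mean(axis=0)

def score(tokens):
    # Cosine similarity between the passage embedding and the query centroid.
    v = embed(tokens)
    return float(v @ centroid / np.linalg.norm(v))

tokens = rng.integers(0, V, size=L)  # random initial adversarial passage
before = score(tokens)

# Greedy coordinate ascent: at each position, keep the token substitution
# that most increases similarity. (GASLITE uses model gradients to narrow
# the candidate set; this brute-force toy version tries all of them.)
for _ in range(3):                   # a few sweeps over the passage
    for i in range(L):
        best, best_s = tokens[i], score(tokens)
        for c in range(V):
            tokens[i] = c
            s = score(tokens)
            if s > best_s:
                best, best_s = c, s
        tokens[i] = best
after = score(tokens)
```

After a few sweeps, `after` is well above `before`: the optimized token sequence embeds much closer to the query centroid, which is exactly the property that earns the adversarial passage a high rank for queries drawn from that concept.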
Problem

Research questions and friction points this paper is trying to address.

Deep Learning
Adversarial Attacks
Information Retrieval
Innovation

Methods, ideas, or system contributions that make the work stand out.

GASLITE
malicious content injection
search system vulnerability