🤖 AI Summary
This work addresses a security vulnerability in Retrieval-Augmented Generation (RAG) systems known as “hubness,” in which a small set of documents consistently ranks in the top-k results across diverse queries, enabling injection of malicious content or circumvention of content filters. To counter this threat, the authors propose HubScan, an open-source hubness detection framework for RAG that integrates robust statistical anomaly detection (median/MAD-based z-scores), cluster spread analysis of cross-cluster retrieval patterns, stability testing under query perturbations, and domain- and modality-aware detection. HubScan supports vector databases including FAISS, Pinecone, Qdrant, and Weaviate, and accommodates vector, hybrid, and lexical retrieval with reranking. Experiments on adversarial hubness benchmarks built from Food-101, MS-COCO, and FiQA show that HubScan achieves 90% recall at a 0.2% alert budget and 100% recall at 0.4%. Domain-scoped scanning also recovers all targeted attacks that evade global detection, and validation on 1M real web documents from MS MARCO shows significant score separation between clean documents and adversarial content.
📝 Abstract
Retrieval-Augmented Generation (RAG) systems are essential to contemporary AI applications, allowing large language models to obtain external knowledge via vector similarity search. However, these systems have a significant security flaw: hubness, a phenomenon in which certain items (hubs) appear in the top-k retrieval results for a disproportionately large number of varied queries. Hubs can be exploited to introduce harmful content, manipulate search rankings, bypass content filtering, and degrade system performance.
We introduce hubscan, an open-source security scanner that evaluates vector indices and embeddings to identify hubs in RAG systems. Hubscan presents a multi-detector architecture that integrates: (1) robust statistical hubness detection utilizing median/MAD-based z-scores, (2) cluster spread analysis to assess cross-cluster retrieval patterns, (3) stability testing under query perturbations, and (4) domain-aware and modality-aware detection for category-specific and cross-modal attacks. Our solution accommodates several vector databases (FAISS, Pinecone, Qdrant, Weaviate) and offers versatile retrieval techniques, including vector similarity, hybrid search, and lexical matching with reranking capabilities.
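As a rough illustration of the first detector, the sketch below counts how often each document appears in top-k results and converts those counts into median/MAD-based z-scores. It is a minimal sketch under stated assumptions, not hubscan's actual implementation: the function names `k_occurrence` and `robust_z_scores`, the toy retrieval results, and the 1.4826 normal-consistency constant are all illustrative choices that the abstract does not specify.

```python
from collections import Counter
from statistics import median


def k_occurrence(topk_results, num_docs):
    """Count how many top-k lists each document id appears in.

    `topk_results` is a list of per-query lists of document ids
    (hypothetical input format, assumed for this sketch).
    """
    counts = Counter(doc_id for hits in topk_results for doc_id in hits)
    return [counts.get(i, 0) for i in range(num_docs)]


def robust_z_scores(values):
    """Median/MAD z-scores: (x - median) / (1.4826 * MAD).

    The 1.4826 factor makes the MAD comparable to a standard deviation
    for normally distributed data (a common convention, assumed here).
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = 1.4826 * mad
    if scale == 0:
        # Degenerate case: at least half the values equal the median.
        return [
            0.0 if v == med else float("inf") if v > med else float("-inf")
            for v in values
        ]
    return [(v - med) / scale for v in values]


# Toy data: six documents, six queries, top-2 results per query.
# Document 0 is retrieved by five of the six queries.
topk = [[0, 1], [0, 2], [0, 3], [0, 4], [0, 5], [1, 2]]
counts = k_occurrence(topk, num_docs=6)
z = robust_z_scores(counts)
suspect = max(range(len(z)), key=lambda i: z[i])
```

Because the median and MAD are insensitive to a few extreme values, a hub inflates its own z-score without shifting the baseline it is measured against, which is the usual motivation for preferring them over the mean and standard deviation.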
We evaluate hubscan on Food-101, MS-COCO, and FiQA adversarial hubness benchmarks constructed using state-of-the-art gradient-optimized and centroid-based hub generation methods. hubscan achieves 90% recall at a 0.2% alert budget and 100% recall at 0.4%, with adversarial hubs ranking above the 99.8th percentile. Domain-scoped scanning recovers 100% of targeted attacks that evade global detection. Production validation on 1M real web documents from MS MARCO demonstrates significant score separation between clean documents and adversarial content. Our work provides a practical, extensible framework for detecting hubness threats in production RAG systems.
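One way to read "recall at an alert budget" is as follows: rank all documents by hub score, flag the top fraction allowed by the budget, and measure what share of the known adversarial items was flagged. The sketch below assumes that interpretation; the helper name `recall_at_budget` and the synthetic scores are hypothetical and do not reproduce the paper's data or results.

```python
def recall_at_budget(scores, is_adversarial, budget):
    """Share of adversarial items among the top `budget` fraction by score.

    `budget` is a fraction of the corpus (0.002 means 0.2%); at least one
    alert is always raised. This definition is an assumption of the sketch.
    """
    total = sum(is_adversarial)
    if total == 0:
        raise ValueError("no adversarial items to recall")
    n_alerts = max(1, round(budget * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(1 for i in ranked[:n_alerts] if is_adversarial[i])
    return hits / total


# Synthetic corpus of 1000 documents with clean scores in [0, 1).
scores = [i / 1000 for i in range(1000)]
labels = [False] * 1000

# Four planted items with high scores, plus one that hides among clean items.
for idx, score in [(10, 5.0), (20, 4.0), (30, 3.0), (40, 2.0)]:
    scores[idx] = score
    labels[idx] = True
labels[50] = True  # keeps its low score of 0.05

recall_small = recall_at_budget(scores, labels, budget=0.002)  # 2 alerts
recall_large = recall_at_budget(scores, labels, budget=0.004)  # 4 alerts
```

In this toy setup the low-scoring planted item is never flagged at small budgets, which mirrors why a narrower, domain-scoped scan can catch items that a global ranking misses.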