🤖 AI Summary
Retrieval-augmented generation (RAG) systems often generate erroneous responses to out-of-domain (OOD) queries in safety-critical applications. Method: We propose a lightweight, robust OOD detection framework that jointly models retrieval and generation representations via two complementary dimensionality reduction and feature separation strategies, Principal Component Analysis (PCA) and Neural Collapse (NC), and integrates GPT-4o with regression models for efficient detection. Response quality is rigorously validated through dual-track evaluation using large language models (LLMs) and human annotators. Contribution/Results: Experiments on standard benchmarks and a real-world COVID-19 vaccine chatbot demonstrate significant improvements in OOD detection accuracy and response relevance. Our results underscore the critical role of external OOD detectors in enhancing RAG safety. Notably, the simple PCA-based strategy outperforms complex baselines under realistic adversarial conditions, highlighting its deployment efficiency and practical utility.
📝 Abstract
Ensuring safety and in-domain responses for Retrieval-Augmented Generation (RAG) systems is paramount in safety-critical applications, yet remains a significant challenge. To address this, we evaluate four methodologies for Out-Of-Domain (OOD) query detection: GPT-4o, regression-based, Principal Component Analysis (PCA)-based, and Neural Collapse (NC), to ensure the RAG system only responds to queries confined to the system's knowledge base. Specifically, our evaluation explores two novel dimensionality reduction and feature separation strategies: PCA, where top components are selected using explained variance or OOD separability, and an adaptation of Neural Collapse Feature Separation. We validate our approach on standard datasets (StackExchange and MSMARCO) and real-world applications (Substance Use and COVID-19), including tests against LLM-simulated and actual attacks on a COVID-19 vaccine chatbot. Through human and LLM-based evaluations of response correctness and relevance, we confirm that an external OOD detector is crucial for maintaining response relevance.
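The abstract does not spell out the PCA procedure, so the following is only a minimal sketch of one plausible reading: fit PCA on in-domain query embeddings, keep the top components by cumulative explained variance, and flag a query when its variance-scaled distance in the reduced space exceeds a threshold. The function names (`fit_pca`, `ood_score`), the 90% variance cutoff, the distance-based score, the 95th-percentile threshold, and the synthetic Gaussian "embeddings" are all assumptions made for illustration, not the authors' method. Selecting components by OOD separability would additionally need labeled OOD examples and is not shown.

```python
import numpy as np

def fit_pca(X, var_threshold=0.90):
    """Fit PCA on in-domain embeddings and keep the fewest leading
    components whose cumulative explained variance reaches var_threshold."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2 / (X.shape[0] - 1)          # variance along each direction
    ratio = np.cumsum(var) / var.sum()       # cumulative explained variance
    k = min(int(np.searchsorted(ratio, var_threshold)) + 1, len(var))
    return mean, Vt[:k], var[:k]

def ood_score(X, mean, components, var):
    """Variance-scaled distance from the in-domain mean in the reduced space.
    This score is an illustrative assumption, not the paper's detector."""
    z = (X - mean) @ components.T
    return np.sqrt(np.sum(z ** 2 / var, axis=-1))

# Synthetic stand-ins for query embeddings (assumption: real embeddings
# would come from the RAG system's encoder).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 16))          # in-domain, used to fit PCA
X_heldout = rng.normal(size=(200, 16))        # in-domain, held out
X_ood = rng.normal(size=(200, 16)) + 5.0      # shifted, plays the OOD role

mean, components, var = fit_pca(X_train)

# Flag a query as OOD if its score exceeds the 95th percentile of
# the in-domain training scores.
threshold = np.percentile(ood_score(X_train, mean, components, var), 95)
flag_rate_id = float(np.mean(ood_score(X_heldout, mean, components, var) > threshold))
flag_rate_ood = float(np.mean(ood_score(X_ood, mean, components, var) > threshold))

print(f"components kept: {components.shape[0]}")
print(f"in-domain flagged: {flag_rate_id:.2f}, OOD flagged: {flag_rate_ood:.2f}")
```

In a deployed RAG pipeline, a query flagged this way would be refused or routed to a fallback response rather than passed to retrieval and generation. The threshold sets the trade-off between rejecting legitimate in-domain queries and admitting OOD ones.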