🤖 AI Summary
This paper identifies “Contextual Distraction Vulnerability” (CDV)—a capability-level deficit in large language models wherein performance degrades significantly under semantically coherent yet irrelevant contextual interference. To address this, we propose the first automated CDV test-sample generation framework, leveraging tree-based search and adversarial context perturbation to systematically construct highly distracting inputs. Evaluated across four standard benchmarks, state-of-the-art models suffer an average accuracy drop of roughly 45%. We further design targeted post-training strategies, with the best approach recovering over 85% of original performance. Our key contributions are: (1) the first formal definition and empirical characterization of CDV; (2) a scalable, automated evaluation paradigm for contextual robustness; and (3) experimental validation that targeted post-training effectively enhances resilience to contextual distraction—establishing a new benchmark and methodology for robustness research in LLMs.
📝 Abstract
Recent advances in Large Language Models (LLMs) have revolutionized generative systems, achieving excellent performance across diverse domains. Although these models perform well in controlled environments, their real-world applications frequently encounter inputs containing both essential and irrelevant details. Our investigation reveals a critical vulnerability in LLMs, which we term Contextual Distraction Vulnerability (CDV). This phenomenon arises when models fail to maintain consistent performance on questions modified with semantically coherent but irrelevant context. To systematically investigate this vulnerability, we propose an efficient tree-based search methodology to automatically generate CDV examples. Our approach successfully generates CDV examples across four datasets, causing an average performance degradation of approximately 45% in state-of-the-art LLMs. To address this critical issue, we explore various mitigation strategies and find that targeted post-training approaches can effectively enhance model robustness against contextual distractions. Our findings highlight the fundamental nature of CDV as an ability-level challenge rather than a knowledge-level issue, since models demonstrate the necessary knowledge by answering correctly in the absence of distractions. We call the community's attention to addressing CDV during model development to ensure reliability. The code is available at https://github.com/wyf23187/LLM_CDV.
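The tree-based search described above can be illustrated with a minimal sketch: starting from a clean question, each search node inserts one semantically coherent but irrelevant distractor sentence, candidates are scored by how much they reduce a model's answer confidence, and a small beam of the most distracting variants is expanded further. Note that `perturb`, `score`, `tree_search`, and the `toy_model` confidence function below are hypothetical stand-ins for illustration, not the paper's actual implementation or scoring model.

```python
def perturb(question, distractor_pool):
    """Generate child nodes: the question with one coherent-but-irrelevant
    distractor sentence prepended (a stand-in for the paper's adversarial
    context perturbation step)."""
    return [f"{d} {question}" for d in distractor_pool]

def score(variant, model):
    """Higher score = more distracting (larger drop in answer confidence)."""
    return 1.0 - model(variant)

def tree_search(question, model, distractor_pool, depth=2, beam=2):
    """Greedy beam-style tree search for a maximally distracting variant."""
    frontier = [question]
    best = (score(question, model), question)
    for _ in range(depth):
        children = []
        for node in frontier:
            for cand in perturb(node, distractor_pool):
                s = score(cand, model)
                children.append((s, cand))
                if s > best[0]:
                    best = (s, cand)
        # Keep only the `beam` most distracting children for expansion.
        children.sort(key=lambda pair: pair[0], reverse=True)
        frontier = [c for _, c in children[:beam]]
    return best

# Toy stand-in "model": confidence falls as the input grows longer.
toy_model = lambda text: max(0.0, 1.0 - 0.01 * len(text))

distractors = [
    "A nearby museum recently reopened after renovations.",
    "The weather that week was unseasonably warm.",
]
s, variant = tree_search("What is the capital of France?", toy_model, distractors)
```

In a real pipeline, `toy_model` would be replaced by an LLM's probability of the correct answer, and `perturb` by an LLM-generated, semantically coherent distractor; the search skeleton stays the same.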