🤖 AI Summary
This study identifies and formally names a novel security vulnerability in large language models (LLMs) termed “emoticon semantic confusion,” wherein ASCII emoticons induce silent failures—syntactically valid but semantically misaligned model outputs that deviate from user intent, potentially triggering unintended or even destructive actions. To systematically investigate this issue, the authors construct a dataset of 3,757 code-oriented test cases spanning 21 meta-scenarios, four programming languages, and multiple levels of contextual complexity. Through automated test generation, cross-model behavioral analysis, and security evaluation, they show that leading LLMs exhibit an average confusion rate exceeding 38%. Moreover, conventional prompting strategies prove largely ineffective at mitigating this vulnerability, underscoring its prevalence and potential risk across current LLM deployments.
📝 Abstract
Emoticons are widely used in digital communication to convey affective intent, yet their safety implications for Large Language Models (LLMs) remain largely unexplored. In this paper, we identify emoticon semantic confusion, a vulnerability where LLMs misinterpret ASCII-based emoticons and perform unintended, even destructive, actions. To systematically study this phenomenon, we develop an automated data generation pipeline and construct a dataset containing 3,757 code-oriented test cases spanning 21 meta-scenarios, four programming languages, and varying contextual complexities. Our study of six LLMs reveals that emoticon semantic confusion is pervasive, with an average confusion ratio exceeding 38%. More critically, over 90% of confused responses yield “silent failures”: syntactically valid outputs that deviate from user intent, potentially leading to destructive security consequences. Furthermore, we observe that this vulnerability readily transfers to popular agent frameworks, while existing prompt-based mitigations remain largely ineffective. We call on the community to recognize this emerging vulnerability and develop effective mitigation methods to uphold the safety and reliability of LLM systems.
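The distinction between a loud failure (invalid code) and a silent one (valid code that violates intent) can be sketched as a simple classifier. Everything below is an illustrative assumption, not the paper's actual evaluation harness: the `intent_check` predicate and the file-deletion example are hypothetical stand-ins for the dataset's real intent checks.

```python
import ast

def classify_response(code: str, intent_check) -> str:
    """Hypothetical classifier for the failure modes described above."""
    try:
        ast.parse(code)
    except SyntaxError:
        return "loud failure"      # syntactically invalid: easy to catch
    if intent_check(code):
        return "aligned"           # valid and matches user intent
    return "silent failure"        # valid code, wrong semantics

# Illustrative scenario (hypothetical): the user asked to list files
# matching "*.tmp", but a misread emoticon in the prompt led the model
# to emit code that matches "*" -- syntactically fine, semantically
# far more destructive if fed into a deletion step.
good = 'glob.glob("*.tmp")'
bad = 'glob.glob("*")'
intent = lambda c: '"*.tmp"' in c

print(classify_response(bad, intent))   # silent failure
print(classify_response(good, intent))  # aligned
```

The point of the sketch is that syntax-level validation (`ast.parse` here) never flags the dangerous case; only an intent-level check does, which is why the paper reports these failures as "silent".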