🤖 AI Summary
This study addresses the lack of systematic evaluation of large language models (LLMs) in understanding pragmatic phenomena in Chinese contexts, particularly politeness, impoliteness, and mock politeness. Drawing on Rapport Management Theory and the Model of Mock Politeness, the authors construct the first three-class Chinese dataset comprising both authentic and simulated utterances. They evaluate six prominent LLMs, including GPT-5.1 and DeepSeek, under zero-shot, few-shot, knowledge-enhanced, and hybrid prompting strategies. The findings reveal significant limitations in current models' ability to recognize mock politeness in Chinese, highlighting a critical gap in pragmatic competence. This work provides both a theoretical framework and an empirical benchmark to advance the development of pragmatically aware LLMs and foster humanistically grounded AI systems.
📝 Abstract
From a pragmatic perspective, this study systematically evaluates how representative large language models (LLMs) differ in recognizing politeness, impoliteness, and mock politeness in Chinese. To address existing gaps in pragmatic comprehension, the research adopts the frameworks of Rapport Management Theory and the Model of Mock Politeness to construct a three-category dataset combining authentic and simulated Chinese discourse. Six representative models, including GPT-5.1 and DeepSeek, were selected as test subjects and evaluated under four prompting conditions: zero-shot, few-shot, knowledge-enhanced, and hybrid strategies. This study serves as a meaningful attempt within the paradigm of "Great Linguistics," offering a novel approach to applying pragmatic theory in an age of technological transformation. It also responds to the contemporary question of how technology and the humanities may coexist, representing an interdisciplinary endeavor that bridges linguistic technology and humanistic reflection.
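The four prompting conditions described in the abstract can be sketched as prompt templates for the three-class task. This is a minimal illustration only: the label names, example utterances, and theory summary below are hypothetical placeholders, not the paper's actual dataset items or knowledge-enhancement text.

```python
# Sketch of the four evaluation conditions: zero-shot, few-shot,
# knowledge-enhanced, and hybrid. All materials are illustrative.

LABELS = ["politeness", "impoliteness", "mock politeness"]

TASK = (
    "Classify the following Chinese utterance as one of: "
    + ", ".join(LABELS) + ".\n"
)

# Hypothetical few-shot demonstrations (not from the paper's dataset).
EXAMPLES = (
    "Utterance: 您太客气了！ -> politeness\n"
    "Utterance: 你怎么这么笨！ -> impoliteness\n"
    "Utterance: 哟，您可真是大忙人啊。 -> mock politeness\n"
)

# Hypothetical stand-in for the knowledge-enhancement text.
KNOWLEDGE = (
    "Background: Rapport Management Theory views (im)politeness as the "
    "management of social relations; mock politeness uses superficially "
    "polite forms to convey criticism or irony.\n"
)

def build_prompt(utterance: str, strategy: str) -> str:
    """Assemble the prompt for one of the four conditions:
    'zero-shot', 'few-shot', 'knowledge', or 'hybrid'."""
    parts = [TASK]
    if strategy in ("knowledge", "hybrid"):
        parts.append(KNOWLEDGE)          # add theoretical background
    if strategy in ("few-shot", "hybrid"):
        parts.append(EXAMPLES)           # add labeled demonstrations
    parts.append(f"Utterance: {utterance} ->")
    return "".join(parts)

hybrid_prompt = build_prompt("哟，您可真是大忙人啊。", "hybrid")
```

Under this sketch, the hybrid condition simply concatenates the knowledge text and the few-shot demonstrations before the target utterance, so each condition is a strict superset of zero-shot.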