The Pragmatic Mind of Machines: Tracing the Emergence of Pragmatic Competence in Large Language Models

📅 2025-05-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Prior studies lack a fine-grained characterization of how pragmatic competence, such as implicature understanding and intention inference, evolves during large language model (LLM) training. Method: We introduce ALTPRAG, the first alternative-hypothesis-based dataset enabling dynamic tracking of pragmatic interpretation and contrastive reasoning, and combine pragmatic theory with multi-stage evaluation, cross-model comparison across 22 LLMs, and a two-dimensional (cognitive-pragmatic) analysis. Contribution/Results: Pragmatic competence emerges progressively and compositionally: base models already show pragmatic sensitivity, which improves with model and data scale, while supervised fine-tuning (SFT) and preference optimization (e.g., RLHF) yield further gains, particularly in cognitive-pragmatic reasoning. This work provides the first systematic account of the training origins and developmental trajectory of LLM pragmatic competence.

📝 Abstract

Current large language models (LLMs) have demonstrated emerging capabilities in social intelligence tasks, including implicature resolution (Sravanthi et al., 2024) and theory-of-mind reasoning (Shapira et al., 2024), both of which require substantial pragmatic understanding. However, how LLMs acquire this competence throughout the training process remains poorly understood. In this work, we introduce ALTPRAG, a dataset grounded in the pragmatic concept of alternatives, designed to evaluate whether LLMs at different training stages can accurately infer nuanced speaker intentions. Each instance pairs two contextually appropriate but pragmatically distinct continuations, enabling fine-grained assessment of both pragmatic interpretation and contrastive reasoning. To examine the development of pragmatic competence, we systematically evaluate 22 LLMs across three key training stages: pre-training, supervised fine-tuning (SFT), and preference optimization. Our results show that even base models exhibit notable sensitivity to pragmatic cues, which improves consistently with increases in model and data scale. Additionally, SFT and RLHF contribute further gains, particularly in cognitive-pragmatic reasoning. These findings highlight pragmatic competence as an emergent and compositional property of LLM training and offer new insights for aligning models with human communicative norms.
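
To make the paired-continuation setup concrete, here is a minimal Python sketch of how one such instance might be represented and scored. Everything in it is an assumption for illustration: the field names, the `contrastive_accuracy` metric, and the toy `word_overlap_scorer` are hypothetical and are not taken from the ALTPRAG dataset or the paper's evaluation protocol. A real evaluation would replace the toy scorer with a model-based one.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class AltPragInstance:
    """Hypothetical layout of one instance; field names are illustrative only."""
    context: str
    continuation_a: str
    continuation_b: str
    intention_a: str  # intention assumed to lie behind continuation_a
    intention_b: str  # intention assumed to lie behind continuation_b


# A scorer rates how well an intention fits a continuation in a given context.
Scorer = Callable[[str, str, str], float]


def contrastive_accuracy(instances: List[AltPragInstance], score: Scorer) -> float:
    """Fraction of instances where BOTH continuations are matched to their own intention."""
    if not instances:
        return 0.0
    correct = 0
    for inst in instances:
        a_ok = (score(inst.context, inst.continuation_a, inst.intention_a)
                > score(inst.context, inst.continuation_a, inst.intention_b))
        b_ok = (score(inst.context, inst.continuation_b, inst.intention_b)
                > score(inst.context, inst.continuation_b, inst.intention_a))
        if a_ok and b_ok:
            correct += 1
    return correct / len(instances)


def word_overlap_scorer(context: str, continuation: str, intention: str) -> float:
    """Toy stand-in for a model: counts lowercase words shared by continuation and intention."""
    shared = set(continuation.lower().split()) & set(intention.lower().split())
    return float(len(shared))


# Invented example, not drawn from the dataset.
example = AltPragInstance(
    context="Are you coming to the party tonight?",
    continuation_a="I have to work early tomorrow",
    continuation_b="I would not miss it",
    intention_a="decline because of work",
    intention_b="accept and not miss it",
)

accuracy = contrastive_accuracy([example], word_overlap_scorer)
```

Requiring both continuations to be matched correctly is one way to reflect the contrastive aspect of the task: a scorer that always prefers the same intention cannot get credit for either half of the pair.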

Problem

Research questions and friction points this paper is trying to address.

How LLMs acquire pragmatic competence during training

Evaluating LLMs' ability to infer speaker intentions

Assessing pragmatic interpretation and contrastive reasoning in LLMs

Innovation

Methods, ideas, or system contributions that make the work stand out.

ALTPRAG, a dataset grounded in pragmatic alternatives, for evaluating pragmatic competence

Systematic evaluation of 22 LLMs across training stages

SFT and RLHF enhance cognitive-pragmatic reasoning
