🤖 AI Summary
Large language models (LLMs) struggle with pragmatic inference: reasoning about implicit meaning that goes beyond literal semantics. Method: We introduce ImpliedMeaningPreference, the first pragmatic preference dataset with explicit correct and incorrect reasoning chains, and propose a Chain-of-Thought (CoT)-integrated pragmatic preference learning paradigm that incorporates explicit reasoning guidance into preference tuning. Contribution/Results: Our method significantly improves zero-shot generalization to unseen pragmatic tasks (e.g., presupposition, deixis). Experiments across multiple LLM families show an 11.12% absolute gain in pragmatic understanding accuracy and a 16.10% improvement in cross-task transfer performance. These results demonstrate the effectiveness and scalability of reasoning-driven training for modeling pragmatic competence.
📝 Abstract
Pragmatics, the ability to infer meaning beyond literal interpretation, is crucial for social cognition and communication. While LLMs have been benchmarked for their pragmatic understanding, improving their performance remains underexplored. Existing methods rely on annotated labels but overlook the reasoning process humans naturally use to interpret implicit meaning. To bridge this gap, we introduce a novel pragmatic dataset, ImpliedMeaningPreference, that includes explicit reasoning (thoughts) for both correct and incorrect interpretations. Through preference tuning and supervised fine-tuning, we demonstrate that thought-based learning significantly enhances LLMs' pragmatic understanding, improving accuracy by 11.12% across model families. We further present a transfer-learning study in which we evaluate thought-based training on pragmatic tasks (presupposition, deixis) that are unseen during training, observing an improvement of 16.10% over label-trained models.
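To make the described setup concrete, the sketch below shows how a thought-augmented preference pair might be assembled in the format expected by DPO-style preference trainers (a `prompt`/`chosen`/`rejected` triple). The field names, helper function, and the example utterance are illustrative assumptions, not taken from the ImpliedMeaningPreference dataset itself.

```python
# Hypothetical sketch: one preference example pairing a correct reasoning
# chain (chosen) with an incorrect one (rejected), as a DPO-style record.
# All names and example text are illustrative, not from the paper's data.

def make_preference_pair(utterance, context,
                         correct_thought, correct_meaning,
                         wrong_thought, wrong_meaning):
    """Build a prompt/chosen/rejected triple with explicit thoughts."""
    prompt = (
        f"Context: {context}\n"
        f"Utterance: \"{utterance}\"\n"
        "What does the speaker imply? Reason step by step, then answer."
    )
    # Preferred response: correct reasoning followed by the implied meaning.
    chosen = f"Thought: {correct_thought}\nImplied meaning: {correct_meaning}"
    # Dispreferred response: a literal-only (incorrect) reasoning chain.
    rejected = f"Thought: {wrong_thought}\nImplied meaning: {wrong_meaning}"
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

pair = make_preference_pair(
    utterance="It's getting late.",
    context="A guest says this to the host near the end of a dinner party.",
    correct_thought="Mentioning the time here works as a polite, indirect "
                    "signal rather than a literal report.",
    correct_meaning="The guest is hinting that they would like to leave.",
    wrong_thought="The utterance simply states the current time of day.",
    wrong_meaning="The guest is only commenting on the clock.",
)
```

A preference-tuning loop would then reward the model for ranking `chosen` above `rejected`, so the explicit thoughts, not just the final label, carry the training signal.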