🤖 AI Summary
Prior pragmatics and LLM research predominantly examines cooperative communication, neglecting systematic evaluation of strategic language understanding in high-stakes, non-cooperative contexts such as courtroom cross-examinations. Method: We propose CoBRA (Cooperation-Breach Response Assessment), the first framework to quantitatively assess LLMs' pragmatic competence in adversarial discourse, introducing three interpretable metrics -- Benefit at Turn (BaT), Penalty at Turn (PaT), and Normalized Relative Benefit at Turn (NRBaT) -- and curating CHARM, the first annotated dataset of real-world courtroom cross-examinations. Contribution/Results: We find that mainstream LLMs exhibit pervasive deficits in strategic pragmatic understanding; counterintuitively, reasoning-augmented models show significantly degraded performance, while scaling model size yields only marginal gains. These results uncover a potential tension between pragmatic and logical reasoning capabilities in LLMs, providing both theoretical insight and an empirical benchmark for developing trustworthy language models in non-cooperative settings.
📝 Abstract
Language is often used strategically, particularly in high-stakes, adversarial settings, yet most work on pragmatics and LLMs centers on cooperativity. This leaves a gap in the systematic understanding of non-cooperative discourse. To address this, we introduce CoBRA (Cooperation-Breach Response Assessment), along with three interpretable metrics -- Benefit at Turn (BaT), Penalty at Turn (PaT), and Normalized Relative Benefit at Turn (NRBaT) -- to quantify the perceived strategic effects of discourse moves. We also present CHARM, an annotated dataset of real courtroom cross-examinations, to demonstrate the framework's effectiveness. Using these tools, we evaluate a range of LLMs and show that they generally exhibit limited pragmatic understanding of strategic language. While larger model size is associated with better performance on our metrics, reasoning ability does not help and largely hurts, introducing overcomplication and internal confusion.
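The abstract names the three metrics but does not give their formulas. The sketch below is purely illustrative and rests on assumptions not confirmed by the paper: that each turn receives judged benefit and penalty scores, that BaT and PaT are per-turn means of those scores, and that NRBaT is a normalized difference of the two. The `TurnJudgment`, `bat`, `pat`, and `nrbat` names are hypothetical, introduced only for this example.

```python
# Hypothetical sketch of turn-level metric aggregation for CoBRA-style scores.
# The formulas below are assumptions for illustration, not the paper's method.
from dataclasses import dataclass


@dataclass
class TurnJudgment:
    benefit: float  # assumed judged strategic benefit of a discourse move, in [0, 1]
    penalty: float  # assumed judged strategic penalty of a discourse move, in [0, 1]


def bat(turns: list[TurnJudgment]) -> float:
    """Assumed Benefit at Turn: mean judged benefit across turns."""
    return sum(t.benefit for t in turns) / len(turns)


def pat(turns: list[TurnJudgment]) -> float:
    """Assumed Penalty at Turn: mean judged penalty across turns."""
    return sum(t.penalty for t in turns) / len(turns)


def nrbat(turns: list[TurnJudgment]) -> float:
    """Assumed Normalized Relative Benefit at Turn: (BaT - PaT) / (BaT + PaT), in [-1, 1]."""
    b, p = bat(turns), pat(turns)
    return 0.0 if b + p == 0 else (b - p) / (b + p)


# Example: two cross-examination turns, each with judged benefit/penalty scores.
turns = [TurnJudgment(benefit=0.8, penalty=0.1), TurnJudgment(benefit=0.4, penalty=0.5)]
print(f"BaT={bat(turns):.2f}  PaT={pat(turns):.2f}  NRBaT={nrbat(turns):.2f}")
```

Under these assumptions, a positive NRBaT indicates that a model's discourse moves are judged net-beneficial on balance, while a negative value indicates net-harmful moves; the normalization makes scores comparable across dialogues of different lengths.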