EmoDistill: Offline Emotion Skill Distillation for Language Model Agents in Adversarial Negotiation

📅 2026-05-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a strategic vulnerability of human-preference-aligned language models in adversarial negotiation: emotionally framed language can steer them toward the counterparty's interests. Using GoEmotions-based affective prompting, the authors first show that emotion substantially shifts negotiation outcomes, which suggests that emotion is a strategic action channel rather than a surface style. They then propose EmoDistill, an offline framework that decomposes emotional negotiation skill into emotion selection and emotion expression. An Implicit Q-Learning (IQL) selector learns which emotion to express, and a Low-Rank Adaptation (LoRA)-based policy learns how to express it through Supervised Fine-Tuning (SFT) and Judge Policy Optimization (JPO). Across four emotion-sensitive, high-stakes negotiation domains, SLM policies trained with EmoDistill achieve the highest utility, outperforming vanilla SLM/LLM baselines and IQL-only emotion selection. Ablations show that emotion conditioning is essential, and transfer studies show generalization across domains, unseen counterparties, and trained-vs-trained tournaments.
📝 Abstract
Post-trained LLMs are often optimized to align responses with human preferences, making them safe, polite, and conversationally appropriate. In adversarial negotiation, however, this alignment can become a vulnerability: emotionally framed language may steer agents toward the counterparty's interests. Using GoEmotions-based affective prompting, we show that emotion substantially shifts negotiation outcomes, suggesting that emotion is a strategic action channel rather than a surface style. Thus, we introduce EmoDistill, an offline framework for distilling emotional negotiation skills into language model agents. EmoDistill decomposes emotional strategy into emotion selection and emotion expression: an Implicit Q-Learning (IQL) selector learns which emotion to express, while a Low-Rank Adaptation (LoRA)-based policy learns how to express it through Supervised Fine-Tuning (SFT) and Judge Policy Optimization (JPO). Across four emotion-sensitive, high-stakes negotiation domains, SLM policies trained under the EmoDistill framework achieve the highest utility, outperforming vanilla SLM/LLM baselines and IQL-only emotion selection. Ablations show that emotion conditioning is essential, and transfer studies demonstrate generalization across domains, unseen counterparties, and trained-vs-trained tournaments. Overall, EmoDistill learns skills from offline agent-to-agent interactions, avoiding costly online negotiation during training.
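To make the "which emotion" half of the decomposition more concrete, here is a minimal sketch of the two ingredients an IQL-style selector typically relies on: the asymmetric (expectile) squared loss that IQL uses to fit its value function from offline data, and a greedy pick over per-emotion Q-values. This is an illustration only, not the authors' implementation. The function names, the value tau=0.7, the example emotion labels, and the Q-values are all assumptions; the abstract does not specify the selector's action space or hyperparameters.

```python
def expectile_loss(diff, tau=0.7):
    """Asymmetric squared loss |tau - 1(diff < 0)| * diff**2.

    In IQL, diff is typically Q(s, a) - V(s). With tau > 0.5, positive
    differences are penalised more, which pushes V toward an upper
    expectile of Q without querying out-of-dataset actions.
    tau=0.7 is an assumed value, not one taken from the paper.
    """
    weight = tau if diff >= 0 else 1.0 - tau
    return weight * diff * diff


def select_emotion(q_values):
    """Return the emotion label with the highest estimated Q-value.

    q_values is a hypothetical dict mapping emotion labels to Q estimates
    for the current negotiation state.
    """
    return max(q_values, key=q_values.get)


# Hypothetical Q estimates for one negotiation state (made-up numbers).
example_q = {"neutral": 0.10, "anger": 0.35, "gratitude": 0.22}
chosen = select_emotion(example_q)
```

In a full pipeline, the chosen label would then condition the expression policy (the LoRA-tuned model in the paper's description), which decides how the emotion is worded.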
Problem

Research questions and friction points this paper is trying to address.

adversarial negotiation
emotion strategy
language model agents
offline distillation
emotional vulnerability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Emotion Distillation
Offline Reinforcement Learning
Implicit Q-Learning (IQL)
Low-Rank Adaptation (LoRA)
Adversarial Negotiation