EmoDistill: Offline Emotion Skill Distillation for Language Model Agents in Adversarial Negotiation

📅 2026-05-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a strategic vulnerability of human-preference-aligned language models in adversarial negotiation: they are prone to conceding when opponents use emotionally charged language. To mitigate this, the authors propose EmoDistill, a framework that treats emotion not as surface style but as a strategic action channel. EmoDistill uses offline distillation to decompose emotional negotiation skill into two components: emotion selection and emotion expression. The former integrates GoEmotions-based prompting with Implicit Q-Learning (IQL), while the latter uses LoRA fine-tuning that combines Supervised Fine-Tuning (SFT) and Judge Policy Optimization (JPO). In experiments across four high-stakes negotiation scenarios, EmoDistill-trained policies achieve the highest utility, outperforming vanilla baselines and IQL-only emotion selection. Ablation studies indicate that emotion conditioning is essential, and transfer studies show generalization across domains and opponent types.
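To make the LoRA component mentioned above concrete, here is a minimal sketch of a low-rank adapted linear layer in plain Python. It shows only the generic LoRA update, h = W x + (alpha / r) · B (A x), and is not the authors' implementation; the function names `matvec` and `lora_forward`, the toy matrices, and the scaling value are all illustrative assumptions.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(x: Vector, W: Matrix, A: Matrix, B: Matrix, alpha: float) -> Vector:
    """Frozen base layer plus a trainable low-rank correction.

    W is d_out x d_in (frozen), A is r x d_in, B is d_out x r.
    Only A and B would be updated during fine-tuning.
    """
    r = len(A)  # adapter rank (hypothetical toy value below)
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + (alpha / r) * d for b, d in zip(base, delta)]


# Toy example: identity base weights, rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]
B_zero = [[0.0], [0.0]]   # common initialization: the adapter starts as a no-op
B_tuned = [[1.0], [2.0]]  # illustrative values after some training

x = [1.0, 2.0]
out_init = lora_forward(x, W, A, B_zero, alpha=2.0)
out_tuned = lora_forward(x, W, A, B_tuned, alpha=2.0)
```

With `B` initialized to zero the adapted layer reproduces the frozen base model exactly, which is why LoRA fine-tuning starts from the pretrained behavior and only gradually shifts it.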

📝 Abstract

Post-trained LLMs are often optimized to align responses with human preferences, making them safe, polite, and conversationally appropriate. In adversarial negotiation, however, this alignment can become a vulnerability: emotionally framed language may steer agents toward the counterparty's interests. Using GoEmotions-based affective prompting, we show that emotion substantially shifts negotiation outcomes, suggesting that emotion is a strategic action channel rather than a surface style. We therefore introduce EmoDistill, an offline framework for distilling emotional negotiation skills into language model agents. EmoDistill decomposes emotional strategy into emotion selection and emotion expression: an Implicit Q-Learning (IQL) selector learns which emotion to express, while a Low-Rank Adaptation (LoRA)-based policy learns how to express it through Supervised Fine-Tuning (SFT) and Judge Policy Optimization (JPO). Across four emotion-sensitive, high-stakes negotiation domains, small language model (SLM) policies trained under the EmoDistill framework achieve the highest utility, outperforming vanilla SLM/LLM baselines and IQL-only emotion selection. Ablations show that emotion conditioning is essential, and transfer studies demonstrate generalization across domains, unseen counterparties, and trained-vs-trained tournaments. Overall, EmoDistill learns skills from offline agent-to-agent interactions, avoiding costly online negotiation during training.
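The emotion-selection step relies on Implicit Q-Learning, whose value function is fit with an asymmetric (expectile) squared loss. The sketch below illustrates that loss and a greedy emotion choice over Q-values. It is a simplified, tabular illustration rather than the paper's selector: the helper names, the value of `tau`, the Q-values, and the greedy argmax rule are all assumptions made for this example. The emotion labels are taken from the GoEmotions label set.

```python
from typing import Dict, Sequence


def expectile_loss(diff: float, tau: float) -> float:
    """Asymmetric squared loss |tau - 1(diff < 0)| * diff**2 used by IQL.

    diff stands for Q(s, a) - V(s). With tau > 0.5, positive differences
    are penalized more, pushing V toward the upper range of Q-values.
    """
    weight = tau if diff > 0 else 1.0 - tau
    return weight * diff * diff


def fit_expectile(values: Sequence[float], tau: float, iters: int = 100) -> float:
    """Find the scalar v minimizing the summed expectile loss over values.

    Uses the fixed-point form of the first-order condition:
    v = sum(w_i * x_i) / sum(w_i), with w_i depending on the sign of x_i - v.
    """
    v = sum(values) / len(values)
    for _ in range(iters):
        weights = [tau if x > v else 1.0 - tau for x in values]
        v = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return v


def select_emotion(q_values: Dict[str, float]) -> str:
    """Pick the emotion with the highest estimated Q-value (assumed greedy rule)."""
    return max(q_values, key=q_values.get)


# Hypothetical Q-values for one negotiation state; the numbers are made up.
q_state = {"neutral": 0.0, "gratitude": 1.0, "disappointment": 2.0}

v_mean = fit_expectile(list(q_state.values()), tau=0.5)   # tau = 0.5 recovers the mean
v_upper = fit_expectile(list(q_state.values()), tau=0.9)  # tau > 0.5 leans toward the max
chosen = select_emotion(q_state)
```

Because the expectile is computed only over actions that appear in the logged data, this style of value estimate never queries actions outside the dataset, which is the property that makes IQL a natural fit for the offline setting described in the abstract.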

Problem

Research questions and friction points this paper is trying to address.

adversarial negotiation

emotion strategy

language model agents

offline distillation

emotional vulnerability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Emotion Distillation

Offline Reinforcement Learning

Implicit Q-Learning (IQL)