Interaction Dynamics as a Reward Signal for LLMs

📅 2025-11-11

🤖 AI Summary

Existing large language model (LLM) alignment methods for multi-turn dialogue rely almost entirely on rewards derived from textual content and neglect interaction dynamics, a complementary and underexploited signal. Method: The paper proposes TRACE (Trajectory-based Reward for Agent Collaboration Estimation), a framework built on "conversational geometry": it treats geometric properties of a dialogue's embedding trajectory, such as curvature and divergence, as structured reward signals that model interaction patterns explicitly. TRACE's geometric reward does not require access to the raw dialogue text, which supports privacy-preserving agent alignment and diagnostic analysis of collaborative behavior. A hybrid reward model combines the geometric rewards with textual rewards and is optimized via reinforcement learning. Contribution/Results: Using geometry-only rewards, TRACE achieves 68.20% pairwise accuracy, comparable to an LLM baseline that reads the full transcript (70.04%). The hybrid model raises accuracy to 80.17%, supporting the claim that in interactive settings how an agent communicates predicts success about as well as what it says.

📝 Abstract

The alignment of Large Language Models (LLMs) for multi-turn conversations typically relies on reward signals derived from the content of the text. This approach, however, overlooks a rich, complementary source of signal: the dynamics of the interaction itself. This paper introduces TRACE (Trajectory-based Reward for Agent Collaboration Estimation), a novel reward signal derived from the geometric properties of a dialogue's embedding trajectory, a concept we term 'conversational geometry'. Our central finding is that a reward model trained only on these structural signals achieves a pairwise accuracy (68.20%) comparable to that of a powerful LLM baseline that analyzes the full transcript (70.04%). Furthermore, a hybrid model combining interaction dynamics with textual analysis achieves the highest performance (80.17%), demonstrating that the two signals are complementary. This work provides strong evidence that, in interactive settings, how an agent communicates is as powerful a predictor of success as what it says. It offers a new, privacy-preserving framework that not only aligns agents but also serves as a diagnostic tool for understanding the distinct interaction patterns that drive successful collaboration.
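
To make the idea of a dialogue's embedding trajectory concrete, here is a minimal sketch of how geometric features might be computed from a sequence of per-turn embedding vectors. The abstract does not define the features TRACE actually uses, so everything here is an assumption made for illustration: the function name `trajectory_features`, the four features, and the use of the turning angle between consecutive steps as a discrete stand-in for curvature. Note that the function sees only vectors, never the dialogue text.

```python
import math


def _sub(a, b):
    """Element-wise difference a - b of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _norm(v):
    return math.sqrt(_dot(v, v))


def trajectory_features(embeddings):
    """Illustrative geometric features of a dialogue trajectory.

    embeddings: list of per-turn embedding vectors, in turn order.
    The feature set is hypothetical, not the one defined by TRACE.
    """
    # Displacement between each pair of consecutive turns.
    steps = [_sub(b, a) for a, b in zip(embeddings, embeddings[1:])]
    path_length = sum(_norm(s) for s in steps)

    # Turning angle between consecutive steps: a discrete curvature proxy.
    angles = []
    for s, t in zip(steps, steps[1:]):
        denom = _norm(s) * _norm(t)
        if denom == 0.0:
            continue  # a zero-length step has no direction
        cos = max(-1.0, min(1.0, _dot(s, t) / denom))
        angles.append(math.acos(cos))

    # Straight-line distance from the first turn to the last.
    net = _norm(_sub(embeddings[-1], embeddings[0])) if len(embeddings) > 1 else 0.0

    return {
        "path_length": path_length,
        "net_displacement": net,
        "mean_turning_angle": sum(angles) / len(angles) if angles else 0.0,
        "straightness": net / path_length if path_length > 0 else 0.0,
    }
```

Under these assumptions, a conversation that moves steadily in one direction in embedding space has a straightness near 1 and a mean turning angle near 0, while one that keeps changing direction scores lower on straightness and higher on turning angle. A reward model could take such a feature dictionary as input in place of the transcript.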

Problem

Research questions and friction points this paper is trying to address.

Developing reward signals from interaction dynamics for LLM alignment

Creating geometric trajectory-based rewards for conversational agents

Combining structural and textual signals to improve dialogue performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

TRACE uses conversational geometry for reward signals

Reward model trained only on structural signals reaches 68.20% pairwise accuracy, close to the full-transcript LLM baseline (70.04%)

Hybrid model combining dynamics with text achieves the highest pairwise accuracy (80.17%)
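
The accuracy figures above are pairwise accuracies. As a rough sketch of what that metric measures, the code below counts how often a scoring function ranks the preferred dialogue of a pair above the rejected one. The `blend` helper is purely hypothetical: the source does not say how the hybrid model combines the two signals, so a fixed weighted sum is assumed here only to show two scores being merged into one.

```python
def pairwise_accuracy(score, pairs):
    """Fraction of (preferred, rejected) pairs ranked correctly by `score`.

    score: callable mapping a dialogue to a float reward.
    pairs: list of (preferred, rejected) dialogue tuples.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    correct = sum(1 for preferred, rejected in pairs
                  if score(preferred) > score(rejected))
    return correct / len(pairs)


def blend(geometry_score, text_score, weight=0.5):
    """Hypothetical hybrid: a weighted sum of two reward functions.

    This is an assumed combination rule, not the one used by TRACE.
    """
    def hybrid(dialogue):
        return weight * geometry_score(dialogue) + (1.0 - weight) * text_score(dialogue)
    return hybrid
```

With this definition, a scorer that ranks pairs at random would be expected to land near 50%, which gives a reference point for reading the reported 68.20%, 70.04%, and 80.17%.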
