CollabLLM: From Passive Responders to Active Collaborators

📅 2025-02-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) exhibit passive responsiveness, shallow intent understanding, and insufficient collaborative proactivity in multi-turn dialogues, a limitation rooted in single-turn reward modeling during reinforcement learning (RL) alignment. To address this, the authors propose a multi-turn-aware reward mechanism and a collaborative simulation framework that incorporate long-horizon interaction intent into the RL alignment process, enabling deeper intent inference and forward-looking suggestion generation. The approach is evaluated on a self-constructed multi-turn benchmark covering tasks such as document creation. Experiments demonstrate an average 18.5% improvement in task performance and a 46.3% gain in interactivity as rated by LLM judges. A user study shows a 17.6% increase in satisfaction and a 10.4% reduction in average interaction time. This work establishes a reproducible technical pathway for building proactive, collaborative dialogue systems.
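The core idea of a multi-turn-aware reward, estimating a response's long-term contribution by rolling out simulated future turns with a user simulator, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `simulate_user` and `task_reward` callables are hypothetical stand-ins for a learned user simulator and a task-level scoring function.

```python
def multiturn_aware_reward(response, simulate_user, task_reward,
                           n_rollouts=4, horizon=3, discount=0.9):
    """Estimate the long-term value of a candidate response.

    Rolls out `n_rollouts` simulated continuations of the conversation,
    each `horizon` turns long, and averages the discounted task rewards.
    `simulate_user(history)` returns the next simulated user turn;
    `task_reward(history)` scores task progress for the dialogue so far.
    """
    total = 0.0
    for _ in range(n_rollouts):
        history = [response]          # conversation continues from this response
        rollout_value = 0.0
        for t in range(horizon):
            user_turn = simulate_user(history)   # simulated user reply
            history.append(user_turn)
            rollout_value += (discount ** t) * task_reward(history)
        total += rollout_value
    return total / n_rollouts
```

Under this framing, a response that merely answers the literal request can score lower than one that asks a clarifying question, because the latter steers simulated future turns toward higher task reward, which is the signal the reinforcement fine-tuning then optimizes.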

📝 Abstract
Large Language Models are typically trained with next-turn rewards, limiting their ability to optimize for long-term interaction. As a result, they often respond passively to ambiguous or open-ended user requests, failing to help users reach their ultimate intents and leading to inefficient conversations. To address these limitations, we introduce CollabLLM, a novel and general training framework that enhances multiturn human-LLM collaboration. Its key innovation is a collaborative simulation that estimates the long-term contribution of responses using Multiturn-aware Rewards. By reinforcement fine-tuning these rewards, CollabLLM goes beyond responding to user requests, and actively uncovers user intent and offers insightful suggestions, a key step towards more human-centered AI. We also devise a multiturn interaction benchmark with three challenging tasks such as document creation. CollabLLM significantly outperforms our baselines with averages of 18.5% higher task performance and 46.3% improved interactivity by LLM judges. Finally, we conduct a large user study with 201 judges, where CollabLLM increases user satisfaction by 17.6% and reduces user spent time by 10.4%.
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Multi-turn Dialogue
Collaborative Engagement
Innovation

Methods, ideas, or system contributions that make the work stand out.

CollabLLM
Multi-turn Dialogue
Enhanced User Satisfaction