🤖 AI Summary
This work addresses a limitation of traditional non-collaborative dialogue agents: they rely on handcrafted strategies that are costly to develop and difficult to scale. The authors propose METRO, a method based on large language models that automatically induces structured strategic knowledge directly from raw expert dialogue transcripts and organizes it into a hierarchical Strategy Forest that jointly models short-term responses and long-term strategic planning. The method requires no manual strategy design, transfers across tasks, and supports behavioral diversity. Evaluated on two benchmarks, it outperforms existing methods by an average of 9%–10%.
📝 Abstract
Developing non-collaborative dialogue agents traditionally requires the manual, unscalable codification of expert strategies. We propose METRO, a method that leverages large language models to autonomously induce both strategy actions and planning logic directly from raw transcripts. METRO formalizes expert knowledge into a Strategy Forest, a hierarchical structure that captures both short-term responses (nodes) and long-term strategic foresight (branches). Experimental results across two benchmarks show that METRO outperforms existing methods by an average of 9%–10%. Our further analysis attributes this success to strategic behavioral diversity and foresight, and also demonstrates METRO's robust cross-task transferability. This offers new insights into building non-collaborative agents in a cost-effective and scalable way. Our code is available at https://github.com/Humphrey-0125/METRO.
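To make the node and branch terminology concrete, the following is a minimal sketch of how a Strategy Forest might be represented. It is not taken from the METRO code base: the `StrategyNode` class, the `branches` and `next_actions` helpers, and the negotiation-style action names are all hypothetical and chosen only for illustration. The sketch assumes each node holds one strategy action and each root-to-leaf path stands for one long-term plan.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StrategyNode:
    """One strategy action, i.e. a short-term response option (hypothetical schema)."""
    action: str
    children: list[StrategyNode] = field(default_factory=list)

    def add(self, action: str) -> StrategyNode:
        """Attach a follow-up action and return the new child node."""
        child = StrategyNode(action)
        self.children.append(child)
        return child


def branches(node: StrategyNode, prefix: tuple[str, ...] = ()):
    """Yield every root-to-leaf path below `node`; each path is one long-term plan."""
    path = prefix + (node.action,)
    if not node.children:
        yield path
        return
    for child in node.children:
        yield from branches(child, path)


def next_actions(forest: list[StrategyNode], history: list[str]) -> list[str]:
    """Return the actions that can follow `history` in any tree of the forest."""
    frontier = forest
    for action in history:
        # Keep only nodes matching the action taken, then step to their children.
        frontier = [c for n in frontier if n.action == action for c in n.children]
    return sorted({n.action for n in frontier})


# Illustrative forest with a single tree; the action names are invented.
root = StrategyNode("build rapport")
ask = root.add("ask about budget")
ask.add("propose price")
ask.add("offer discount")
root.add("emphasize quality").add("propose price")
forest = [root]
```

Under these assumptions, `list(branches(root))` enumerates the three plans stored in the tree, and `next_actions(forest, ["build rapport"])` lists the responses available after the opening move. How METRO actually induces, stores, and queries its forest is described in the paper and repository, not here.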