🤖 AI Summary
In real-time human-AI collaboration, AI agents struggle to model humans' dynamic and heterogeneous mental states, such as domain-specific intentions, when there is no direct communication, and they adapt poorly as a result. To address this, the authors propose a dual-process, multi-scale Theory of Mind (ToM) framework built on large language models (LLMs) and inspired by the "fast/slow" dual-system view of cognition. The slow reasoning system integrates multi-scale mental modules to model human intent hierarchically and dynamically. The framework operates without explicit human feedback, which improves the agent's robustness to unseen human behaviors. Experiments show substantial improvements over baseline methods in both collaborative efficiency and intent recognition accuracy, and ablation studies confirm that the multi-scale modules in the slow system are essential. The work offers a paradigm for interpretable and adaptive human-AI collaboration.
📝 Abstract
Real-time human-artificial intelligence (AI) collaboration is crucial yet challenging, especially when AI agents must adapt to diverse and unseen human behaviors in dynamic scenarios. Existing large language model (LLM) agents often fail to accurately model complex human mental characteristics, such as domain intentions, especially in the absence of direct communication. To address this limitation, we propose a novel dual-process multi-scale theory of mind (DPMT) framework, drawing inspiration from dual-process theory in cognitive science. The slow system of our DPMT framework incorporates a multi-scale theory of mind (ToM) module that supports robust human partner modeling through mental characteristic reasoning. Experimental results demonstrate that DPMT significantly enhances human-AI collaboration, and ablation studies further validate the contribution of the multi-scale ToM module in the slow system.