🤖 AI Summary
To address strategic coordination in long-horizon human–machine collaboration under incomplete information, this paper proposes an online cooperative decision-making framework driven by multi-step intent. Methodologically, we extend shared-control games so that players can take multiple actions per turn, and introduce IntentMCTS, an online Monte Carlo Tree Search algorithm that combines a running probabilistic belief memory, multi-step intent modeling, and intent-augmented reward shaping. Our core contribution is moving beyond single-step intent reasoning, so that the agent can plan in real time around communicated multi-step intent. In agent-to-agent simulations on the Gnomes at Night testbed, IntentMCTS requires fewer steps and control switches than baseline methods. A human-agent user study shows an 18.52% higher success rate than the heuristic baseline and a 5.56% improvement over the single-step prior work. Participants also report lower cognitive load and frustration and higher satisfaction, suggesting that multi-step intent modeling benefits both long-horizon collaborative efficiency and the human partner's experience.
📝 Abstract
Strategic coordination between autonomous agents and human partners under incomplete information can be modeled as turn-based cooperative games. We extend a turn-based game under incomplete information, the shared-control game, to allow players to take multiple actions per turn rather than a single action. The extension enables the use of multi-step intent, which we hypothesize will improve performance in long-horizon tasks. To synthesize cooperative policies for the agent in this extended game, we propose an approach featuring a memory module for a running probabilistic belief of the environment dynamics and an online planning algorithm called IntentMCTS. This algorithm strategically selects the next action by leveraging any communicated multi-step intent via reward augmentation while considering the current belief. Agent-to-agent simulations in the Gnomes at Night testbed demonstrate that IntentMCTS requires fewer steps and control switches than baseline methods. A human-agent user study corroborates these findings, showing an 18.52% higher success rate compared to the heuristic baseline and a 5.56% improvement over the single-step prior work. Participants also report lower cognitive load and frustration, and higher satisfaction, with the IntentMCTS agent partner.
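As a rough illustration of the two ingredients described above (a running probabilistic belief and reward augmentation from communicated multi-step intent), the following Python sketch may help. It is not the paper's implementation: the Beta-Bernoulli belief over whether a move is passable, the names `EdgeBelief`, `intent_bonus`, and `augmented_reward`, the progress-proportional bonus, and the `weight` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Cell = Tuple[int, int]  # hypothetical grid coordinate (row, col)


@dataclass
class EdgeBelief:
    """Assumed Beta-Bernoulli belief that a move between two cells is passable."""
    alpha: float = 1.0  # pseudo-count of observed successful moves
    beta: float = 1.0   # pseudo-count of observed blocked moves

    def update(self, passable: bool) -> None:
        # Each observation increments the matching pseudo-count.
        if passable:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def p_passable(self) -> float:
        # Posterior mean of the Beta distribution.
        return self.alpha / (self.alpha + self.beta)


def intent_bonus(position: Cell, intent: List[Cell], weight: float = 1.0) -> float:
    """Assumed shaping term: reward progress along the communicated intent path."""
    if position not in intent:
        return 0.0
    # Later waypoints earn a larger share of the bonus.
    return weight * (intent.index(position) + 1) / len(intent)


def augmented_reward(base_reward: float, position: Cell, intent: List[Cell],
                     p_passable: float, weight: float = 1.0) -> float:
    """Base reward plus the intent bonus, discounted by the current belief."""
    return base_reward + p_passable * intent_bonus(position, intent, weight)


# Usage: the partner communicates a three-step intent; the agent has seen
# two successful moves and one blocked move on the relevant edge.
belief = EdgeBelief()
for observed in (True, True, False):
    belief.update(observed)
intent_path = [(0, 1), (0, 2), (0, 3)]
reward = augmented_reward(-1.0, (0, 3), intent_path, belief.p_passable, weight=2.0)
```

In a planner of this kind, a shaped reward such as `augmented_reward` would be evaluated during tree-search rollouts, so that simulated trajectories consistent with the partner's stated intent score higher than those that ignore it.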