🤖 AI Summary
Traditional intention recognition focuses on individual agents and overlooks how collective goals form in group settings. This paper introduces *Group Intention Forecasting* (GIF), a new task that forecasts when a group intention will emerge by analyzing individual actions and interactions before the collective goal becomes apparent. To support this task, we present SHOT, the first large-scale dataset for GIF, comprising 1,979 basketball video clips captured from 5 camera views and annotated with 6 types of individual attributes. SHOT is built around three characteristics: multi-individual information, multi-view adaptability, and multi-level intention. We further propose GIFT (Group Intention ForecasTer), a framework that extracts fine-grained individual features and models evolving group dynamics to forecast intention emergence. Experimental results confirm the effectiveness of both SHOT and GIFT, providing a foundation for future research on group intention forecasting.
📝 Abstract
Intention recognition has traditionally focused on individual intentions, overlooking the complexities of collective intentions in group settings. To address this limitation, we introduce the concept of group intention, which represents shared goals emerging through the actions of multiple individuals, and Group Intention Forecasting (GIF), a novel task that forecasts when group intentions will occur by analyzing individual actions and interactions before the collective goal becomes apparent. To investigate GIF in a specific scenario, we propose SHOT, the first large-scale dataset for GIF, consisting of 1,979 basketball video clips captured from 5 camera views and annotated with 6 types of individual attributes. SHOT is designed with 3 key characteristics: multi-individual information, multi-view adaptability, and multi-level intention, making it well-suited for studying emerging group intentions. Furthermore, we introduce GIFT (Group Intention ForecasTer), a framework that extracts fine-grained individual features and models evolving group dynamics to forecast intention emergence. Experimental results confirm the effectiveness of SHOT and GIFT, establishing a strong foundation for future research in group intention forecasting. The dataset is available at https://xinyi-hu.github.io/SHOT_DATASET.