Group-DINOmics: Incorporating People Dynamics into DINO for Self-supervised Group Activity Feature Learning

📅 2026-04-06
🤖 AI Summary
This work addresses Group Activity Feature (GAF) learning when no group activity annotations are available, proposing a self-supervised approach built on DINO. It uses two pretext tasks: person flow estimation, which captures the local motion of each person, and group-relevant object location estimation, which captures global scene context such as the spatial relations of people and objects. Combined with the local and global features provided by DINO, these tasks yield group-dynamics-aware GAFs. Experiments on public datasets show state-of-the-art performance in group activity retrieval and recognition, and ablation studies verify the effectiveness of each component.
📝 Abstract
This paper proposes Group Activity Feature (GAF) learning without group activity annotations. Unlike prior work, which uses low-level static local features to learn GAFs, we propose leveraging dynamics-aware and group-aware pretext tasks, along with local and global features provided by DINO, for group-dynamics-aware GAF learning. To adapt DINO and GAF learning to local dynamics and global group features, our pretext tasks use person flow estimation and group-relevant object location estimation, respectively. Person flow estimation is used to represent the local motion of each person, which is an important cue for understanding group activities. In contrast, group-relevant object location estimation encourages GAFs to learn scene context (e.g., spatial relations of people and objects) as global features. Comprehensive experiments on public datasets demonstrate the state-of-the-art performance of our method in group activity retrieval and recognition. Our ablation studies verify the effectiveness of each component in our method. Code: https://github.com/tezuka0001/Group-DINOmics.
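As a rough illustration of how a local (per-person flow) term and a global (object location) term could be combined into one pretext objective, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify the loss functions, prediction heads, or weighting, so the endpoint-error flow term, the squared-distance location term, and the names `endpoint_error`, `location_error`, `pretext_loss`, `w_flow`, and `w_obj` are all assumptions made for illustration.

```python
import math


def endpoint_error(pred_flows, target_flows):
    """Mean Euclidean distance between predicted and target 2D flow vectors.

    Each list holds one (u, v) flow vector per person (assumed format).
    """
    assert pred_flows and len(pred_flows) == len(target_flows)
    total = 0.0
    for (pu, pv), (tu, tv) in zip(pred_flows, target_flows):
        total += math.hypot(pu - tu, pv - tv)
    return total / len(pred_flows)


def location_error(pred_xy, target_xy):
    """Squared L2 distance between predicted and target (x, y) positions."""
    dx = pred_xy[0] - target_xy[0]
    dy = pred_xy[1] - target_xy[1]
    return dx * dx + dy * dy


def pretext_loss(pred_flows, target_flows, pred_obj_xy, target_obj_xy,
                 w_flow=1.0, w_obj=1.0):
    """Weighted sum of the local and global terms (weights are hypothetical)."""
    local_term = endpoint_error(pred_flows, target_flows)
    global_term = location_error(pred_obj_xy, target_obj_xy)
    return w_flow * local_term + w_obj * global_term


# Toy example: two people, one group-relevant object (e.g., a ball).
loss = pretext_loss(
    pred_flows=[(1.0, 0.0), (0.0, 2.0)],
    target_flows=[(1.0, 0.0), (3.0, 6.0)],
    pred_obj_xy=(0.5, 0.5),
    target_obj_xy=(0.5, 0.0),
)
print(loss)
```

In this toy case the first person's flow is predicted exactly and the second is off by a distance of 5, so the flow term averages to 2.5; the object position is off by 0.5 along one axis, contributing 0.25. In a real training setup the predictions would come from heads attached to the backbone's local and global features, which this sketch leaves out.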
Problem

Research questions and friction points this paper is trying to address.

Group Activity Recognition
Self-supervised Learning
People Dynamics
Feature Learning
DINO
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised Learning
Group Activity Recognition
DINO
Person Flow Estimation
Scene Context