🤖 AI Summary
Videos of human motion against complex backgrounds often violate the union-of-subspaces (UoS) assumption, which hinders human motion segmentation (HMS), the partitioning of a video into non-overlapping motions.
Method: This paper proposes a temporally consistent structured clustering framework for HMS. Its core innovation is a temporal rate reduction criterion, introduced here for the first time, that drives subspace alignment so that the learned representations conform to a UoS structure even when the raw frame features do not. The framework further integrates information-theoretic rate reduction optimization, end-to-end structured representation learning, adaptive affinity graph construction, and spectral clustering.
Contribution/Results: The method is compatible with diverse feature extractors and achieves state-of-the-art performance across five standard HMS benchmarks. It significantly improves action boundary localization accuracy, particularly in cluttered-background scenarios, demonstrating robustness and generalizability.
📝 Abstract
Human Motion Segmentation (HMS), which aims to partition videos into non-overlapping human motions, has attracted increasing research attention recently. Existing approaches for HMS are dominated by subspace clustering methods, which are grounded on the assumption that high-dimensional temporal data align with a Union-of-Subspaces (UoS) distribution. However, the frames in videos capturing complex human motions with cluttered backgrounds may not align well with the UoS distribution. In this paper, we propose a novel approach for HMS, named Temporal Rate Reduction Clustering ($\text{TR}^2\text{C}$), which jointly learns structured representations and affinity to segment the frame sequences in video. Specifically, the structured representations learned by $\text{TR}^2\text{C}$ remain temporally consistent and align well with a UoS structure, which is favorable for the HMS task. We conduct extensive experiments on five benchmark HMS datasets and achieve state-of-the-art performance with different feature extractors.
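
To make the rate reduction idea concrete, the sketch below computes the standard (non-temporal) rate reduction quantity from the maximal coding rate reduction literature for a set of frame features and a candidate segmentation. It is an illustration only and not the $\text{TR}^2\text{C}$ objective, whose temporal term is not specified in the abstract. The function names, the distortion parameter `eps`, and the synthetic two-subspace data are all assumptions made for this example.

```python
import numpy as np


def coding_rate(Z, eps=0.5):
    """Coding rate R(Z) = 1/2 * logdet(I + d / (n * eps^2) * Z Z^T).

    Z has shape (d, n): one d-dimensional feature per column (frame).
    """
    d, n = Z.shape
    gram = np.eye(d) + (d / (n * eps**2)) * (Z @ Z.T)
    _, logdet = np.linalg.slogdet(gram)
    return 0.5 * logdet


def rate_reduction(Z, labels, eps=0.5):
    """Rate of the whole set minus the size-weighted rates of each segment."""
    _, n = Z.shape
    rate_within = 0.0
    for k in np.unique(labels):
        Zk = Z[:, labels == k]
        rate_within += (Zk.shape[1] / n) * coding_rate(Zk, eps)
    return coding_rate(Z, eps) - rate_within


# Synthetic "video" (assumed data): 20 frames in one 2-D subspace,
# followed by 20 frames in an orthogonal 2-D subspace of R^6.
rng = np.random.default_rng(0)
basis = np.eye(6)
seg_a = basis[:, :2] @ rng.standard_normal((2, 20))
seg_b = basis[:, 2:4] @ rng.standard_normal((2, 20))
Z = np.concatenate([seg_a, seg_b], axis=1)
Z /= np.linalg.norm(Z, axis=0, keepdims=True)  # unit-norm features

true_labels = np.repeat([0, 1], 20)  # temporally contiguous segments
alt_labels = np.arange(40) % 2       # labels alternating frame by frame

print("contiguous segmentation:", rate_reduction(Z, true_labels))
print("alternating segmentation:", rate_reduction(Z, alt_labels))
```

In this toy setting the temporally contiguous segmentation, which matches the two subspaces, yields a larger rate reduction than the alternating one. This is the sense in which such a criterion favors representations that follow a UoS structure.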