Keypoints as Dynamic Centroids for Unified Human Pose and Segmentation

📅 2025-05-17
🤖 AI Summary
Joint human pose estimation and instance segmentation degrades when joints overlap or poses change rapidly. To address this, the paper proposes Keypoints as Dynamic Centroid (KDC), a bottom-up, centroid-based representation that unifies the two tasks. KDC predicts keypoint heatmaps for both easily distinguishable and complex keypoints, and introduces KeyCentroids, built with a keypoint disk, to improve keypoint detection and confidence scores. High-confidence keypoints then serve as dynamic centroids in the embedding space to generate MaskCentroids, around which pixels are clustered into individual human instances. Experiments on CrowdPose, OCHuman, and COCO are reported to show the method's effectiveness and generalizability in challenging scenarios, in terms of both accuracy and runtime. The core contribution is treating keypoints as dynamic centroids, so that pose and segmentation share a single representation.

📝 Abstract
The dynamic movement of the human body presents a fundamental challenge for human pose estimation and body segmentation. State-of-the-art approaches primarily rely on combining keypoint heatmaps with segmentation masks but often struggle in scenarios involving overlapping joints or rapidly changing poses during instance-level segmentation. To address these limitations, we propose Keypoints as Dynamic Centroid (KDC), a new centroid-based representation for unified human pose estimation and instance-level segmentation. KDC adopts a bottom-up paradigm to generate keypoint heatmaps for both easily distinguishable and complex keypoints and improves keypoint detection and confidence scores by introducing KeyCentroids using a keypoint disk. It leverages high-confidence keypoints as dynamic centroids in the embedding space to generate MaskCentroids, allowing for swift clustering of pixels to specific human instances during rapid body movements in live environments. Our experimental evaluations on the CrowdPose, OCHuman, and COCO benchmarks demonstrate KDC's effectiveness and generalizability in challenging scenarios in terms of both accuracy and runtime performance. The implementation is available at: https://sites.google.com/view/niazahmad/projects/kdc.
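
The clustering idea in the abstract, where high-confidence keypoints act as centroids in the embedding space and pixels are grouped around them, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the confidence threshold, the distance cutoff, the use of a mean over keypoint embeddings, and the assumption that keypoints are already grouped per person are all hypothetical choices made for illustration.

```python
import numpy as np

def centroids_from_keypoints(embeddings, keypoints, conf_threshold=0.5):
    """Build one centroid per person from confident keypoints (illustrative only).

    embeddings: (H, W, D) per-pixel embedding map.
    keypoints:  list with one entry per person; each entry is a list of
                (x, y, confidence) tuples with integer pixel coordinates.
    Persons with no keypoint above the threshold are skipped, so row k of
    the result is the k-th person that was kept.
    """
    centroids = []
    for person in keypoints:
        vecs = [embeddings[y, x] for x, y, c in person if c >= conf_threshold]
        if vecs:
            centroids.append(np.mean(vecs, axis=0))
    if not centroids:
        return np.empty((0, embeddings.shape[-1]))
    return np.stack(centroids)

def assign_pixels(embeddings, centroids, foreground, max_dist=1.0):
    """Label each foreground pixel with its nearest centroid, or -1 if none is close."""
    h, w, d = embeddings.shape
    labels = np.full(h * w, -1, dtype=int)
    if len(centroids) == 0:
        return labels.reshape(h, w)
    flat = embeddings.reshape(-1, d)
    # Pairwise Euclidean distances between pixels and centroids: (H*W, K).
    dists = np.linalg.norm(flat[:, None, :] - centroids[None, :, :], axis=-1)
    nearest = dists.argmin(axis=1)
    keep = (dists.min(axis=1) <= max_dist) & foreground.reshape(-1)
    labels[keep] = nearest[keep]
    return labels.reshape(h, w)
```

Because assignment is a single nearest-centroid lookup per pixel, the cost grows with the number of pixels times the number of people, which is one plausible reason a centroid-based grouping can be fast; the abstract itself does not give this analysis.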

Problem

Research questions and friction points this paper is trying to address.

Challenges in human pose estimation and segmentation due to dynamic body movement
Difficulties in handling overlapping joints and rapid pose changes
Need for a unified approach combining keypoint detection and instance segmentation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Keypoints as Dynamic Centroid (KDC) representation
KeyCentroids improve keypoint detection confidence
MaskCentroids enable swift pixel clustering
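
The abstract mentions that KeyCentroids rely on a keypoint disk but does not define it. One common reading is a circular neighborhood of fixed radius around a keypoint location; the sketch below builds such a disk and averages a heatmap inside it to get a smoothed confidence value. The function names, the radius, and the use of a mean are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def disk_mask(shape, center_xy, radius):
    """Boolean mask of pixels within `radius` of `center_xy` (hypothetical helper)."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cx, cy = center_xy
    return (xs - cx) ** 2 + (ys - cy) ** 2 <= radius ** 2

def disk_confidence(heatmap, center_xy, radius):
    """Mean heatmap response inside the disk.

    Assumes the center lies inside the heatmap, so the disk is never empty.
    """
    mask = disk_mask(heatmap.shape, center_xy, radius)
    return float(heatmap[mask].mean())
```

Averaging over a disk rather than reading a single peak pixel makes the score less sensitive to small localization errors, which is one way a disk could help confidence estimation under occlusion.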