Learning Structure-Supporting Dependencies via Keypoint Interactive Transformer for General Mammal Pose Estimation

📅 2025-02-07
🏛️ International Journal of Computer Vision
📈 Citations: 0
Influential: 0
🤖 AI Summary
Cross-species mammalian pose estimation faces challenges including non-rigid deformations, occlusions, and scarce annotations, all of which stem from inter-species variations in appearance, anatomy, and motion patterns. To address these, we propose the Keypoint Interactive Transformer (KIT), which explicitly models anatomical constraints and inter-joint dependencies. Our method integrates structure-aware graph attention, multi-scale feature alignment, and self-supervised keypoint relation distillation, enabling zero-shot generalization to unseen species. Evaluated on a large-scale cross-species benchmark covering 12 mammalian species, KIT achieves an average 8.3% improvement in PCKh and reduces cross-species transfer error by 37% over state-of-the-art general-purpose pose models. Our core contribution is the first end-to-end keypoint interaction modeling framework tailored for cross-species mammalian pose estimation, one that balances structural priors with data efficiency.
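The summary reports results in PCKh. As a rough illustration of how that metric is usually computed, the sketch below counts a predicted keypoint as correct when its distance to the ground truth is within a fraction `alpha` of the head size (0.5 is the common choice). The function name `pckh`, the `visible` mask, and the toy coordinates are assumptions made for this example and are not taken from the paper.

```python
import math

def pckh(pred, gt, head_size, visible=None, alpha=0.5):
    """Fraction of visible keypoints whose error is within alpha * head_size."""
    if visible is None:
        visible = [True] * len(gt)
    threshold = alpha * head_size
    hits = 0
    total = 0
    for (px, py), (gx, gy), vis in zip(pred, gt, visible):
        if not vis:
            continue  # keypoints without a usable annotation are skipped
        total += 1
        if math.hypot(px - gx, py - gy) <= threshold:
            hits += 1
    return hits / total if total else 0.0

# Toy data (hypothetical): three keypoints, head size of 8 pixels.
pred = [(10.0, 10.0), (20.0, 22.0), (40.0, 40.0)]
gt = [(10.0, 11.0), (20.0, 20.0), (30.0, 30.0)]
score = pckh(pred, gt, head_size=8.0)
```

With a threshold of 4 pixels, the first two predictions count as correct and the third does not. A benchmark score would average this quantity over all annotated instances.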

Problem

Research questions and friction points this paper is trying to address.

General mammal pose estimation
Address appearance and pose variances
Learn instance-level structure-supporting dependencies

Innovation

Methods, ideas, or system contributions that make the work stand out.

Keypoint Interactive Transformer for pose estimation
Generalised heatmap regression loss supervision
Adaptive weight strategy for keypoint imbalance
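This page does not describe how the heatmap loss or the adaptive weights are defined, so the following is only a minimal sketch of the general idea of weighting a heatmap regression loss per keypoint. It substitutes plain mean squared error for the paper's generalised loss and uses inverse annotation frequency as a stand-in weighting rule. The names `inverse_frequency_weights` and `weighted_heatmap_mse` and the toy counts are hypothetical.

```python
def inverse_frequency_weights(counts):
    """Weights proportional to 1/count, rescaled so they average to 1."""
    inv = [1.0 / max(c, 1) for c in counts]
    mean_inv = sum(inv) / len(inv)
    return [v / mean_inv for v in inv]

def weighted_heatmap_mse(pred, target, weights):
    """Average over keypoints of weight_k times the MSE of heatmap k."""
    total = 0.0
    for p, t, w in zip(pred, target, weights):
        mse = sum((a - b) ** 2 for a, b in zip(p, t)) / len(p)
        total += w * mse
    return total / len(pred)

# Toy data (hypothetical): the third keypoint is annotated four times less often.
counts = [1000, 1000, 250]
weights = inverse_frequency_weights(counts)

# Each heatmap is flattened to a short list of pixel values for brevity.
pred = [[0.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
target = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
loss = weighted_heatmap_mse(pred, target, weights)
```

Rescaling the weights to average 1 keeps the overall loss magnitude comparable to the unweighted case while shifting emphasis toward rarely annotated keypoints.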
Tianyang Xu
School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
Jiyong Rao
School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
Xiaoning Song
Professor of Computer Vision and Pattern Recognition, Jiangnan University
Pattern Recognition, Computer Vision, Artificial Intelligence
Zhenhua Feng
School of Computer Science and Electronic Engineering, University of Surrey, Guildford GU2 7XH, UK
Xiao-Jun Wu
School of Artificial Intelligence and Computer Science, Jiangnan University
artificial intelligence, pattern recognition, machine learning