🤖 AI Summary
This work proposes a semi-supervised approach to 3D object rotation regression that trains on 2D images alone, using a small labeled set together with unlabeled images, and requires no auxiliary information such as point clouds or CAD models. The method introduces multi-stage and adaptive hardness-aware curriculum strategies that select pseudo-labeled samples from easy to hard, replacing fixed-threshold filtering, and adds a structured data augmentation strategy tailored to rotation estimation that preserves geometric integrity. On PASCAL3D+ and ObjectNet3D, the approach outperforms existing supervised and semi-supervised baselines, particularly in low-label regimes.
📝 Abstract
Regressing 3D rotations of objects from 2D images is a crucial yet challenging task, with broad applications in autonomous driving, virtual reality, and robotic control. Existing rotation regression models often rely on large amounts of labeled training data or require information beyond 2D images, such as point clouds or CAD models. Semi-supervised rotation regression that uses only a limited number of labeled 2D images is therefore highly valuable. Although the recent FisherMatch method introduces semi-supervised learning to rotation regression, its rigid entropy-based pseudo-label filtering fails to distinguish effectively between reliable and unreliable unlabeled samples. To address this limitation, we propose a hardness-aware curriculum learning framework that dynamically selects pseudo-labeled samples based on their difficulty, progressing from easy to complex examples. We introduce both multi-stage and adaptive curriculum strategies that replace fixed-threshold filtering with more flexible, hardness-aware mechanisms. Additionally, we present a structured data augmentation strategy tailored to rotation estimation, which assembles composite images from augmented patches to introduce feature diversity while preserving critical geometric integrity. Comprehensive experiments on PASCAL3D+ and ObjectNet3D demonstrate that our method outperforms existing supervised and semi-supervised baselines, particularly in low-data regimes, validating the effectiveness of both the curriculum learning framework and the structured augmentation approach.
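The easy-to-hard selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each unlabeled sample already has a scalar hardness score (for example, the predictive entropy of its pseudo-label), and the function names `multi_stage_keep_fraction` and `select_easiest`, the four-stage schedule, and the rank-based selection rule are all hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence


def multi_stage_keep_fraction(
    step: int,
    total_steps: int,
    fractions: Sequence[float] = (0.25, 0.5, 0.75, 1.0),  # assumed schedule
) -> float:
    """Hypothetical multi-stage schedule: fraction of unlabeled samples kept."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp the step, then map it onto one of len(fractions) equal-length stages.
    step = min(max(step, 0), total_steps - 1)
    stage = step * len(fractions) // total_steps
    return fractions[stage]


def select_easiest(hardness: Sequence[float], keep_fraction: float) -> List[int]:
    """Return indices of the lowest-hardness samples, in their original order."""
    if len(hardness) == 0:
        return []
    # Keep at least one sample and never more than the batch size.
    n_keep = min(len(hardness), max(1, math.ceil(keep_fraction * len(hardness))))
    ranked = sorted(range(len(hardness)), key=lambda i: hardness[i])
    return sorted(ranked[:n_keep])


# Usage: hardness scores for one batch of pseudo-labeled samples (made-up values).
batch_hardness = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4]
early = select_easiest(batch_hardness, multi_stage_keep_fraction(0, 100))
late = select_easiest(batch_hardness, multi_stage_keep_fraction(99, 100))
```

Because the selection ranks samples within each batch rather than comparing them to a fixed entropy threshold, the cutoff moves with the score distribution. This is one simple way to obtain adaptive behavior, under the assumptions stated above.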