Box Pose and Shape Estimation and Domain Adaptation for Large-Scale Warehouse Automation

📅 2025-07-01
🤖 AI Summary
To address the scarcity of annotated real-world data for box pose and shape estimation in warehouse automation, this paper proposes a "correct-and-certify" self-supervised domain adaptation framework for simulation-to-reality transfer. The method requires no human annotations: starting from a simulation-pretrained model, it is fine-tuned in a self-supervised manner on 50,000 unlabeled real images. Its core idea is a geometric consistency check that refines pseudo-labels and mitigates domain shift. Evaluated on a large-scale real-world industrial dataset, the approach significantly outperforms both models trained purely in simulation and a zero-shot 3D bounding box estimation baseline, achieving a 12.6% improvement in 3D detection accuracy (AP₅₀). The work targets robust box pose estimation in settings where annotations are unavailable.

📝 Abstract
Modern warehouse automation systems rely on fleets of intelligent robots that generate vast amounts of data -- most of which remains unannotated. This paper develops a self-supervised domain adaptation pipeline that leverages real-world, unlabeled data to improve perception models without requiring manual annotations. Our work focuses specifically on estimating the pose and shape of boxes and presents a correct-and-certify pipeline for self-supervised box pose and shape estimation. We extensively evaluate our approach across a range of simulated and real industrial settings, including adaptation to a large-scale real-world dataset of 50,000 images. The self-supervised model significantly outperforms models trained solely in simulation and shows substantial improvements over a zero-shot 3D bounding box estimation baseline.
Problem

Research questions and friction points this paper is trying to address.

Self-supervised domain adaptation for warehouse robots
Estimating box pose and shape without manual annotations
Improving perception models using unlabeled real-world data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised domain adaptation pipeline
Correct-and-certify pose estimation
Leverages unlabeled real-world data
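
The correct-and-certify idea listed above can be illustrated with a small sketch. This is not the authors' implementation: the source gives no algorithmic details, so everything here is an assumption made for illustration. Boxes are simplified to axis-aligned (center, size) pairs, the corrector simply refits the box to the extent of the observed 3D points, and the certificate accepts an estimate only if most points lie within a tolerance of the box surface. The names `correct`, `certify`, `select_pseudo_labels`, `tol`, and `min_inlier_frac` are hypothetical.

```python
from math import sqrt


def surface_distance(p, center, size):
    """Unsigned distance from point p to the surface of an axis-aligned box."""
    q = [abs(pi - ci) - si / 2.0 for pi, ci, si in zip(p, center, size)]
    outside = sqrt(sum(max(qi, 0.0) ** 2 for qi in q))  # > 0 only outside the box
    inside = min(max(q), 0.0)                           # < 0 only inside the box
    return abs(outside + inside)


def correct(points, center, size):
    """Hypothetical corrector: refit the box to the extent of the points.

    This simplified stand-in ignores the initial (center, size) estimate;
    a real corrector would refine the prediction instead of replacing it.
    """
    lo = [min(p[k] for p in points) for k in range(3)]
    hi = [max(p[k] for p in points) for k in range(3)]
    new_center = [(l + h) / 2.0 for l, h in zip(lo, hi)]
    new_size = [h - l for l, h in zip(lo, hi)]
    return new_center, new_size


def certify(points, center, size, tol=0.01, min_inlier_frac=0.95):
    """Hypothetical certificate: most points must lie near the box surface."""
    inliers = sum(surface_distance(p, center, size) < tol for p in points)
    return inliers / len(points) >= min_inlier_frac


def select_pseudo_labels(samples, predict):
    """Keep only corrected estimates that pass the certificate.

    `samples` is a list of point sets (one per unlabeled observation) and
    `predict` is any callable returning an initial (center, size) estimate.
    """
    kept = []
    for points in samples:
        center, size = predict(points)
        center, size = correct(points, center, size)
        if certify(points, center, size):
            kept.append((points, center, size))
    return kept
```

In a self-training loop of this kind, the certified triples returned by `select_pseudo_labels` would serve as pseudo-labels for fine-tuning the simulation-pretrained model, while uncertified estimates are discarded rather than trusted.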