🤖 AI Summary
The scarcity of high-quality, real-world hand-glove image datasets hinders safety and robustness in human-robot collaborative assembly in industrial settings. Method: We introduce HAGS, the first benchmark dataset for glove-box environments, comprising over 12K real multi-view frames with pixel-level hand/glove segmentation masks and, uniquely, quantified annotation uncertainty. To assess out-of-distribution robustness, we propose a chroma-key augmentation strategy that synthesizes realistic adversarial samples. We further establish a real-time semantic segmentation benchmark to systematically evaluate state-of-the-art models. Results: Experiments reveal substantial performance degradation of existing methods under industrial conditions. HAGS fills a critical gap as the first real-world industrial hand dataset with uncertainty-aware annotations, offering a new open-source benchmark for safe human-robot collaboration.
📝 Abstract
Industry 4.0 introduced AI as a transformative solution for modernizing manufacturing processes. Its successor, Industry 5.0, envisions humans as collaborators and experts guiding these AI-driven manufacturing solutions. Developing such techniques requires algorithms capable of safe, real-time identification of human positions in a scene, particularly hands, during collaborative assembly. Although substantial efforts have curated datasets for hand segmentation, most focus on residential or commercial domains. Existing datasets targeting industrial settings rely predominantly on synthetic data, which we demonstrate does not transfer effectively to real-world operations. Moreover, these datasets lack the uncertainty estimations critical for safe collaboration. To address these gaps, we present HAGS: Hand and Glove Segmentation Dataset. HAGS provides challenging examples for building hand- and glove-segmentation applications in industrial human-robot collaboration scenarios, together with out-of-distribution images, constructed via green-screen augmentations, for assessing ML-classifier robustness. We benchmark state-of-the-art, real-time segmentation models to evaluate existing methods. Our dataset and baselines are publicly available.
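The green-screen (chroma-key) augmentation mentioned above can be sketched as follows. This is a minimal illustrative example, not the paper's actual pipeline: the function name, the green-dominance thresholding rule, and the `green_thresh` parameter are all assumptions introduced here for clarity.

```python
import numpy as np

def chroma_key_composite(frame, background, green_thresh=1.3):
    """Replace green-screen pixels in `frame` with pixels from `background`.

    Both inputs are HxWx3 uint8 RGB arrays of the same shape. A pixel is
    treated as chroma-key green when its G channel exceeds both its R and
    B channels by the factor `green_thresh` (a simple heuristic chosen
    for illustration; the paper does not specify its keying rule).
    """
    f = frame.astype(np.float32)
    r, g, b = f[..., 0], f[..., 1], f[..., 2]
    # Boolean mask of "green screen" pixels.
    key = (g > green_thresh * r) & (g > green_thresh * b)
    out = frame.copy()
    # Swap in the out-of-distribution background only where keyed.
    out[key] = background[key]
    return out
```

Compositing hand/glove foregrounds captured against a green screen onto unseen backgrounds in this way yields out-of-distribution test images whose segmentation masks remain valid, since the foreground pixels are untouched.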