GlovEgo-HOI: Bridging the Synthetic-to-Real Gap for Industrial Egocentric Human-Object Interaction Detection

📅 2026-01-14
🤖 AI Summary
This work addresses first-person (egocentric) human-object interaction (HOI) detection in industrial settings, where the scarcity of annotated data hinders the development of robust models. The authors propose a data generation framework that combines synthetic data with diffusion-based augmentation of real-world images, adding realistic personal protective equipment (PPE). They introduce GlovEgo-HOI, a benchmark dataset for industrial egocentric HOI, and GlovEgo-Net, a model whose Glove-Head and Keypoint-Head modules use hand pose information to improve interaction detection. Experiments support the effectiveness of both the data generation framework and GlovEgo-Net. The dataset, augmentation pipeline, and pre-trained models are released to support further research.

📝 Abstract
Egocentric Human-Object Interaction (EHOI) analysis is crucial for industrial safety, yet the development of robust models is hindered by the scarcity of annotated domain-specific data. We address this challenge by introducing a data generation framework that combines synthetic data with a diffusion-based process to augment real-world images with realistic Personal Protective Equipment (PPE). We present GlovEgo-HOI, a new benchmark dataset for industrial EHOI, and GlovEgo-Net, a model integrating Glove-Head and Keypoint-Head modules to leverage hand pose information for enhanced interaction detection. Extensive experiments demonstrate the effectiveness of the proposed data generation framework and GlovEgo-Net. To foster further research, we release the GlovEgo-HOI dataset, augmentation pipeline, and pre-trained models in the project's GitHub repository.

Problem

Research questions and friction points this paper is trying to address.

Egocentric Human-Object Interaction
Industrial Safety
Data Scarcity
PPE
Synthetic-to-Real Gap
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic-to-Real Gap
Diffusion-Based Augmentation
Egocentric Human-Object Interaction
Hand Pose Estimation
Industrial Safety
Authors

Alfio Spoto
Department of Mathematics and Computer Science, University of Catania, Via S. Sofia, 64, 95125 Catania, Italy; Next Vision s.r.l., Italy
Rosario Leonardi
University of Catania
Computer Vision, Machine Learning, Egocentric Vision
Francesco Ragusa
Department of Mathematics and Computer Science, University of Catania, Via S. Sofia, 64, 95125 Catania, Italy; Next Vision s.r.l., Italy
Giovanni Maria Farinella
University of Catania
Computer Vision, Machine Learning