ENIGMA-360: An Ego-Exo Dataset for Human Behavior Understanding in Industrial Scenarios

📅 2026-03-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the scarcity of temporally synchronized, finely annotated first-person (ego) and third-person (exo) multi-view human activity data in real-world industrial settings—a key bottleneck for advancing intelligent assistance and safety systems. To bridge this gap, we introduce ENIGMA-360, a novel dataset captured in authentic industrial environments, comprising 180 time-synchronized ego-exo procedural video pairs with fine-grained spatiotemporal annotations. This is the first large-scale effort to achieve synchronized acquisition and detailed labeling in such complex real-world scenarios. We further define three benchmark tasks: temporal action segmentation, keystep recognition, and egocentric human-object interaction detection. Baseline experiments reveal the limited performance of current methods, underscoring the need for robust ego-exo fusion models. The dataset and annotations are publicly released to foster community research.
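Temporal synchronization between the ego and exo streams is central to the dataset. As a rough illustration of what such alignment involves (this is not the authors' pipeline or the released file format; timestamps and frame rates are hypothetical), the sketch below pairs frames from two streams by nearest timestamp:

```python
# Illustrative sketch: pairing ego and exo frames by nearest timestamp.
# Timestamps, tolerance, and the example values are hypothetical, not the
# ENIGMA-360 release format.
import bisect

def align_by_timestamp(ego_ts: list[float], exo_ts: list[float],
                       tolerance: float = 0.02) -> list[tuple[int, int]]:
    """For each ego frame, find the exo frame with the closest timestamp.

    Returns (ego_index, exo_index) pairs whose time difference is within
    `tolerance` seconds. Both timestamp lists must be sorted ascending.
    """
    pairs = []
    for i, t in enumerate(ego_ts):
        j = bisect.bisect_left(exo_ts, t)
        # Candidate exo frames: the one just before and just after t.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(exo_ts)]
        best = min(candidates, key=lambda k: abs(exo_ts[k] - t))
        if abs(exo_ts[best] - t) <= tolerance:
            pairs.append((i, best))
    return pairs

if __name__ == "__main__":
    ego = [0.000, 0.033, 0.066, 0.100]   # ~30 fps egocentric stream
    exo = [0.001, 0.034, 0.068, 0.101]   # ~30 fps exocentric stream
    print(align_by_timestamp(ego, exo))  # [(0, 0), (1, 1), (2, 2), (3, 3)]
```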

📝 Abstract
Understanding human behavior from complementary egocentric (ego) and exocentric (exo) points of view enables the development of systems that can support workers in industrial environments and enhance their safety. However, progress in this area is hindered by the lack of datasets capturing both views in realistic industrial scenarios. To address this gap, we propose ENIGMA-360, a new ego-exo dataset acquired in a real industrial scenario. The dataset is composed of 180 egocentric and 180 exocentric procedural videos, temporally synchronized to offer complementary views of the same scene. The 360 videos have been labeled with temporal and spatial annotations, enabling the study of different aspects of human behavior in the industrial domain. We provide baseline experiments for three foundational tasks in human behavior understanding: 1) Temporal Action Segmentation, 2) Keystep Recognition, and 3) Egocentric Human-Object Interaction Detection, showing the limits of state-of-the-art approaches in this challenging scenario. These results highlight the need for new models capable of robust ego-exo understanding in real-world environments. We publicly release the dataset and its annotations at https://iplab.dmi.unict.it/ENIGMA-360.
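To make the Temporal Action Segmentation benchmark concrete, here is a minimal sketch of how segment-level annotations can be expanded into per-frame labels and scored with frame-wise accuracy, a standard metric for this task. The Segment fields and action names (pick_tool, tighten_bolt) are illustrative assumptions, not the released ENIGMA-360 annotation schema.

```python
# Minimal sketch: expand segment annotations to per-frame labels and score
# frame-wise accuracy. Field names and labels are illustrative only.
from dataclasses import dataclass

@dataclass
class Segment:
    start_frame: int  # inclusive
    end_frame: int    # exclusive
    label: str

def to_framewise(segments: list[Segment], num_frames: int,
                 background: str = "bg") -> list[str]:
    """Expand segment annotations into one label per frame."""
    labels = [background] * num_frames
    for seg in segments:
        for f in range(seg.start_frame, min(seg.end_frame, num_frames)):
            labels[f] = seg.label
    return labels

def framewise_accuracy(pred: list[str], gt: list[str]) -> float:
    """Fraction of frames whose predicted label matches the ground truth."""
    assert len(pred) == len(gt)
    return sum(p == g for p, g in zip(pred, gt)) / len(gt)

if __name__ == "__main__":
    gt_segments = [Segment(0, 50, "pick_tool"), Segment(50, 120, "tighten_bolt")]
    pred_segments = [Segment(0, 40, "pick_tool"), Segment(40, 120, "tighten_bolt")]
    gt = to_framewise(gt_segments, 120)
    pred = to_framewise(pred_segments, 120)
    print(f"Frame-wise accuracy: {framewise_accuracy(pred, gt):.2%}")
```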
Problem

Research questions and friction points this paper is trying to address.

ego-exo dataset
human behavior understanding
industrial scenarios
temporal synchronization
behavioral annotation
Innovation

Methods, ideas, or system contributions that make the work stand out.

ego-exo dataset
industrial human behavior
temporal synchronization
multiview video annotation
human-object interaction
Francesco Ragusa
LIVE@IPLAB, Department of Mathematics and Computer Science, University of Catania, Catania, Italy; Next Vision s.r.l. - Spinoff of the University of Catania, Catania, Italy
Rosario Leonardi
University of Catania
Computer Vision, Machine Learning, Egocentric Vision
Michele Mazzamuto
Università degli Studi di Catania
Artificial Intelligence
Daniele Di Mauro
Next Vision s.r.l.
Machine Learning, Computer Vision, Deep Learning
Camillo Quattrocchi
LIVE@IPLAB, Department of Mathematics and Computer Science, University of Catania, Catania, Italy
Alessandro Passanisi
LIVE@IPLAB, Department of Mathematics and Computer Science, University of Catania, Catania, Italy
Irene D'Ambra
LIVE@IPLAB, Department of Mathematics and Computer Science, University of Catania, Catania, Italy
Antonino Furnari
Assistant Professor at the University of Catania
Computer Vision
Giovanni Maria Farinella
University of Catania
Computer Vision, Machine Learning