🤖 AI Summary
Existing 2D detection paradigms for identifying construction site safety violations have limited capacity to model spatial interactions and complex scene semantics. This work proposes Safe-Construct, a 3D multi-view violation recognition framework that reformulates the task as a 3D multi-view engagement problem, jointly integrating worker-object contextual reasoning and explicit 3D spatial understanding. We introduce the Synthetic Indoor Construction Site Generator (SICSG), the first domain-specific synthetic data engine for construction environments, which generates scalable and diverse training data and addresses gaps in both benchmark resources and synthetic data availability. Our method combines 3D geometric modeling, cross-view feature alignment, and synthetic-data-driven training. It achieves a 7.6% improvement over state-of-the-art methods across four representative violation categories and is evaluated under challenging near-realistic conditions, including worker-object and worker-worker occlusions and variable illumination, offering a scalable pathway for intelligent safety monitoring in high-risk industries.
📝 Abstract
Recognizing safety violations in construction environments is critical yet remains underexplored in computer vision. Existing models predominantly rely on 2D object detection, which fails to capture the complexities of real-world violations due to: (i) an oversimplified task formulation that treats violation recognition merely as object detection, (ii) inadequate validation under realistic conditions, (iii) the absence of standardized baselines, and (iv) limited scalability owing to the lack of synthetic dataset generators for diverse construction scenarios. To address these challenges, we introduce Safe-Construct, the first framework that reformulates violation recognition as a 3D multi-view engagement task, leveraging scene-level worker-object context and 3D spatial understanding. We also propose the Synthetic Indoor Construction Site Generator (SICSG) to create diverse, scalable training data, overcoming data limitations. Safe-Construct achieves a 7.6% improvement over state-of-the-art methods across four violation types. We rigorously evaluate our approach in near-realistic settings, incorporating four violations, four workers, 14 objects, and challenging conditions such as occlusions (worker-object, worker-worker) and variable illumination (back-lighting, overexposure, sunlight). By integrating 3D multi-view spatial understanding and synthetic data generation, Safe-Construct sets a new benchmark for scalable and robust safety monitoring in high-risk industries. Project Website: https://Safe-Construct.github.io/Safe-Construct