🤖 AI Summary
To meet the real-time, scalable human activity monitoring requirements of Industry 5.0, this work tackles two key challenges: the high computational overhead of centralized multi-camera processing, and the spatiotemporal inconsistency and accuracy degradation of edge-distributed approaches. It proposes a lightweight multi-view human pose fusion algorithm built on three components: kinematic and biomechanical constraints that make joint localization more robust; convex-optimization-driven inverse kinematics that enforces cross-view spatial consistency; and a state observer that models joint dynamics to ensure temporal continuity. The resulting end-to-end framework achieves state-of-the-art performance on both public and industrial benchmarks, outperforming existing methods in pose localization, activity detection, and trajectory tracking, while maintaining low bandwidth consumption, sub-100 ms latency, and high reliability, which makes it suitable for safety-critical industrial applications.
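To make the idea of a state observer for temporal continuity more concrete, here is a minimal sketch in Python. It is not the observer used in this work: it assumes a constant-velocity model for a single joint coordinate and uses fixed correction gains (an alpha-beta filter). The class name `JointObserver` and the gain values are hypothetical choices for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JointObserver:
    """Fixed-gain observer for one joint coordinate (illustrative only).

    Assumes a constant-velocity motion model; alpha and beta are
    hypothetical gains, not values taken from the paper.
    """
    pos: float = 0.0
    vel: float = 0.0
    alpha: float = 0.5  # position correction gain (assumed)
    beta: float = 0.1   # velocity correction gain (assumed)

    def step(self, measurement: Optional[float], dt: float) -> float:
        # Predict the next position with the constant-velocity model.
        predicted = self.pos + self.vel * dt
        if measurement is None:
            # No detection in this frame: coast on the prediction.
            self.pos = predicted
            return self.pos
        # Correct position and velocity using the prediction error.
        residual = measurement - predicted
        self.pos = predicted + self.alpha * residual
        self.vel = self.vel + (self.beta / dt) * residual
        return self.pos


if __name__ == "__main__":
    obs = JointObserver()
    # A joint coordinate sampled at 10 Hz with one missed detection.
    for z in [0.00, 0.11, None, 0.29, 0.41]:
        print(round(obs.step(z, dt=0.1), 3))
```

In this sketch a missed detection does not interrupt the track, because the observer falls back on its own prediction. A full tracker would run one such filter per joint coordinate, or a single observer over the whole joint state.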
📝 Abstract
In the era of Industry 5.0, monitoring human activity is essential for ensuring both ergonomic safety and overall well-being. While multi-camera centralized setups improve pose estimation accuracy, they often suffer from high computational costs and bandwidth requirements, limiting scalability and real-time applicability. Distributing processing across edge devices can reduce network bandwidth and computational load. However, the constrained resources of edge devices degrade accuracy, and distributing the computation introduces temporal and spatial inconsistencies. We address this challenge by proposing COMETH (Convex Optimization for Multiview Estimation and Tracking of Humans), a lightweight algorithm for real-time multi-view human pose fusion that relies on three concepts: it integrates kinematic and biomechanical constraints to increase joint positioning accuracy; it employs convex optimization-based inverse kinematics for spatial fusion; and it implements a state observer to improve temporal consistency. We evaluate COMETH on both public and industrial datasets, where it outperforms state-of-the-art methods in localization, detection, and tracking accuracy. The proposed fusion pipeline enables accurate and scalable human motion tracking, making it well-suited for industrial and safety-critical applications. The code is publicly available at https://github.com/PARCO-LAB/COMETH.
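As a rough illustration of spatial fusion under a kinematic constraint, the sketch below fuses per-camera 3D estimates of one joint and then enforces a known bone length. It is not the COMETH formulation, which uses convex optimization-based inverse kinematics. Here the fusion is the simplest convex problem available, a confidence-weighted least squares whose closed-form solution is the weighted mean. The bone-length step is a plain rescaling heuristic, not a convex program. The function names `fuse_joint` and `enforce_bone_length` are hypothetical.

```python
import math


def fuse_joint(estimates, weights):
    """Fuse per-view 3D estimates of one joint.

    Minimises sum_i w_i * ||p - p_i||^2 over p; the closed-form
    minimiser of this convex problem is the weighted mean.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one view needs a positive weight")
    return tuple(
        sum(w * p[k] for p, w in zip(estimates, weights)) / total
        for k in range(3)
    )


def enforce_bone_length(parent, child, length):
    """Move the child joint along the parent-to-child ray so that the
    bone has the given length (a simple heuristic, assumed here)."""
    direction = [c - p for p, c in zip(parent, child)]
    norm = math.sqrt(sum(x * x for x in direction))
    if norm == 0.0:
        return tuple(child)  # direction undefined; leave unchanged
    return tuple(p + length * x / norm for p, x in zip(parent, direction))


if __name__ == "__main__":
    # Two cameras report the elbow; the first is more confident.
    elbow = fuse_joint([(0.30, 0.00, 1.20), (0.34, 0.02, 1.18)], [0.9, 0.3])
    shoulder = (0.30, 0.00, 1.50)
    print(enforce_bone_length(shoulder, elbow, length=0.30))
```

Weighting each view by its detection confidence lets an occluded or distant camera contribute less to the fused joint, while the bone-length step keeps the fused skeleton anatomically plausible.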