COMETH: Convex Optimization for Multiview Estimation and Tracking of Humans

📅 2025-08-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To meet the need for real-time, scalable human activity monitoring in Industry 5.0, this work tackles two challenges: the high computational overhead of centralized multi-camera processing, and the spatiotemporal inconsistency and accuracy loss of edge-distributed approaches. It proposes a lightweight multi-view human pose fusion algorithm that integrates kinematic and biomechanical constraints to make joint localization more robust, uses convex-optimization-based inverse kinematics to enforce cross-view spatial consistency, and adds a state observer that models joint dynamics to ensure temporal continuity. The resulting fusion pipeline outperforms state-of-the-art methods on both public and industrial benchmarks in localization, detection, and tracking, while maintaining low bandwidth consumption, sub-100 ms latency, and high reliability, making it suitable for safety-critical industrial applications.

📝 Abstract

In the era of Industry 5.0, monitoring human activity is essential for ensuring both ergonomic safety and overall well-being. While multi-camera centralized setups improve pose estimation accuracy, they often suffer from high computational costs and bandwidth requirements, limiting scalability and real-time applicability. Distributing processing across edge devices can reduce network bandwidth and computational load. On the other hand, the constrained resources of edge devices lead to accuracy degradation, and the distribution of computation leads to temporal and spatial inconsistencies. We address this challenge by proposing COMETH (Convex Optimization for Multiview Estimation and Tracking of Humans), a lightweight algorithm for real-time multi-view human pose fusion that relies on three concepts: it integrates kinematic and biomechanical constraints to increase the joint positioning accuracy; it employs convex optimization-based inverse kinematics for spatial fusion; and it implements a state observer to improve temporal consistency. We evaluate COMETH on both public and industrial datasets, where it outperforms state-of-the-art methods in localization, detection, and tracking accuracy. The proposed fusion pipeline enables accurate and scalable human motion tracking, making it well-suited for industrial and safety-critical applications. The code is publicly available at https://github.com/PARCO-LAB/COMETH.
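To make the idea of constrained spatial fusion concrete, here is a minimal sketch in Python. It is not the COMETH formulation, which the abstract does not spell out. It assumes each camera already supplies a 3D estimate of one joint with a confidence weight, and it replaces the bone-length equality with the convex relaxation "distance to the parent joint is at most the bone length". The function name `fuse_joint` and all parameters are hypothetical. Under these assumptions the problem is a weighted least-squares objective over a Euclidean ball, so its exact solution is the weighted mean projected onto that ball.

```python
import numpy as np

def fuse_joint(estimates, weights, parent, max_bone_length):
    """Fuse per-view 3D estimates of one joint (illustrative, not COMETH's solver).

    Solves  min_x  sum_i w_i * ||x - p_i||^2   s.t.  ||x - parent|| <= max_bone_length.
    The objective equals (sum_i w_i) * ||x - mean||^2 plus a constant, so the
    constrained minimizer is the projection of the weighted mean onto the ball.
    """
    estimates = np.asarray(estimates, dtype=float)   # shape (n_views, 3)
    weights = np.asarray(weights, dtype=float)       # shape (n_views,), all > 0
    parent = np.asarray(parent, dtype=float)         # shape (3,)

    # Unconstrained minimizer: confidence-weighted mean of the view estimates.
    mean = weights @ estimates / weights.sum()

    # Project onto the ball centred at the parent joint if the bone is too long.
    offset = mean - parent
    dist = np.linalg.norm(offset)
    if dist <= max_bone_length:
        return mean
    return parent + offset * (max_bone_length / dist)
```

For example, two equally weighted views placing a wrist at 0.5 m and 0.7 m from the elbow along the x-axis average to 0.6 m; with an assumed forearm length of 0.5 m the fused wrist is pulled back to 0.5 m. A full inverse-kinematics formulation would couple all joints of the skeleton rather than treating one joint at a time.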

Problem

Research questions and friction points this paper is trying to address.

Real-time multi-view human pose fusion with limited resources

Reducing computational costs and bandwidth in distributed systems

Improving accuracy and consistency in human motion tracking

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates kinematic and biomechanical constraints

Employs convex optimization-based inverse kinematics

Implements state observer for temporal consistency
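The state-observer idea in the list above can be illustrated with a fixed-gain observer on a constant-velocity model of a single joint. This is only a sketch under stated assumptions: the class name `JointObserver`, the gains `alpha` and `beta`, and the constant-velocity model are hypothetical choices for illustration, not the observer design used in the paper.

```python
import numpy as np

class JointObserver:
    """Fixed-gain (alpha-beta) observer for one 3D joint; illustrative only."""

    def __init__(self, dt, alpha=0.5, beta=0.1):
        self.dt = dt            # time between frames, in seconds
        self.alpha = alpha      # correction gain on position
        self.beta = beta        # correction gain on velocity
        self.pos = None         # estimated position, set by the first measurement
        self.vel = np.zeros(3)  # estimated velocity

    def update(self, measurement):
        """Advance one frame; pass None when the joint was not detected."""
        if measurement is None:
            if self.pos is None:
                return None
            # No measurement: propagate the constant-velocity model only.
            self.pos = self.pos + self.dt * self.vel
            return self.pos

        z = np.asarray(measurement, dtype=float)
        if self.pos is None:
            self.pos = z.copy()
            return self.pos

        # Predict, then correct position and velocity with the residual.
        predicted = self.pos + self.dt * self.vel
        residual = z - predicted
        self.pos = predicted + self.alpha * residual
        self.vel = self.vel + (self.beta / self.dt) * residual
        return self.pos
```

Calling `update` once per frame with the fused joint position smooths jitter between frames, and calling it with `None` bridges short detection gaps by extrapolating from the last estimated velocity. Larger gains track fast motion more closely at the cost of passing through more measurement noise.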

🔎 Similar Papers

Markerless Multi-view 3D Human Pose Estimation: a survey