COMETH: Convex Optimization for Multiview Estimation and Tracking of Humans

📅 2025-08-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Human activity monitoring in Industry 5.0 must be real-time and scalable. This work tackles two obstacles to that goal: the high computational overhead of centralized multi-camera processing, and the spatiotemporal inconsistency and accuracy degradation of edge-distributed approaches. It proposes COMETH, a lightweight multi-view human pose fusion algorithm. The method integrates kinematic and biomechanical constraints to make joint localization more robust, uses convex-optimization-based inverse kinematics for cross-view spatial consistency, and introduces a state observer that models joint dynamics to ensure temporal continuity. On both public and industrial benchmarks, the resulting framework outperforms existing methods in pose localization, activity detection, and trajectory tracking, while maintaining low bandwidth consumption, sub-100 ms latency, and high reliability, which makes it suitable for safety-critical industrial applications.

📝 Abstract
In the era of Industry 5.0, monitoring human activity is essential for ensuring both ergonomic safety and overall well-being. While multi-camera centralized setups improve pose estimation accuracy, they often suffer from high computational costs and bandwidth requirements, limiting scalability and real-time applicability. Distributing processing across edge devices can reduce network bandwidth and computational load. On the other hand, the constrained resources of edge devices lead to accuracy degradation, and the distribution of computation leads to temporal and spatial inconsistencies. We address this challenge by proposing COMETH (Convex Optimization for Multiview Estimation and Tracking of Humans), a lightweight algorithm for real-time multi-view human pose fusion that relies on three concepts: it integrates kinematic and biomechanical constraints to increase the joint positioning accuracy; it employs convex optimization-based inverse kinematics for spatial fusion; and it implements a state observer to improve temporal consistency. We evaluate COMETH on both public and industrial datasets, where it outperforms state-of-the-art methods in localization, detection, and tracking accuracy. The proposed fusion pipeline enables accurate and scalable human motion tracking, making it well-suited for industrial and safety-critical applications. The code is publicly available at https://github.com/PARCO-LAB/COMETH.
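
The spatial-fusion idea in the abstract can be illustrated with a deliberately small convex problem. The sketch below is not the COMETH formulation (the abstract does not give it); it assumes each camera reports a 3D estimate of one joint with a scalar confidence, and it relaxes the bone-length constraint to "the child joint lies within a maximum distance of its parent", which is a convex ball constraint. Because the confidence-weighted least-squares objective is isotropic, the constrained minimizer is the weighted mean projected onto that ball. The names `fuse_joint`, `project_to_ball`, and `fuse_child_joint` are hypothetical.

```python
import math


def fuse_joint(observations, weights):
    """Minimize sum_v w_v * ||p - z_v||^2 over p (closed form: weighted mean)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one positive weight is required")
    return tuple(
        sum(w * obs[i] for obs, w in zip(observations, weights)) / total
        for i in range(3)
    )


def project_to_ball(point, center, radius):
    """Euclidean projection of `point` onto the ball ||p - center|| <= radius."""
    offset = [p - c for p, c in zip(point, center)]
    dist = math.sqrt(sum(o * o for o in offset))
    if dist <= radius:
        return tuple(point)
    scale = radius / dist
    return tuple(c + scale * o for c, o in zip(center, offset))


def fuse_child_joint(observations, weights, parent, max_bone_length):
    """Weighted fusion of a child joint under a relaxed bone-length constraint.

    Assumption: the constraint is the convex relaxation
    ||p - parent|| <= max_bone_length, not an exact bone length.
    """
    unconstrained = fuse_joint(observations, weights)
    return project_to_ball(unconstrained, parent, max_bone_length)


if __name__ == "__main__":
    # Hypothetical elbow estimates (metres) from three cameras with confidences.
    elbow_views = [(0.30, 0.00, 1.20), (0.34, 0.02, 1.18), (0.50, 0.00, 1.20)]
    confidences = [0.9, 0.8, 0.1]
    shoulder = (0.0, 0.0, 1.2)
    print(fuse_child_joint(elbow_views, confidences, shoulder, 0.30))
```

A full-body version would couple all joints through the kinematic chain and would typically be handed to a convex solver; the single-joint case is shown only because it has a closed-form solution that is easy to check by hand.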
Problem

Research questions and friction points this paper is trying to address.

Real-time multi-view human pose fusion with limited resources
Reducing computational costs and bandwidth in distributed systems
Improving accuracy and consistency in human motion tracking
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates kinematic and biomechanical constraints
Employs convex optimization-based inverse kinematics
Implements state observer for temporal consistency
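
The temporal-consistency idea in the last item can be sketched with a fixed-gain observer. This is an illustration only, assuming a constant-velocity model per joint coordinate and hand-picked gains `alpha` and `beta` (an alpha-beta filter, one of the simplest Luenberger-style observers); the observer used in COMETH may differ. The class name `ConstantVelocityObserver` is hypothetical.

```python
class ConstantVelocityObserver:
    """Fixed-gain observer for one joint coordinate (alpha-beta filter).

    Assumptions: constant-velocity motion model, scalar state per coordinate,
    and gains chosen by hand rather than derived from a noise model.
    """

    def __init__(self, position, alpha=0.5, beta=0.1):
        self.position = float(position)
        self.velocity = 0.0
        self.alpha = alpha
        self.beta = beta

    def predict(self, dt):
        """Propagate the state forward by dt seconds using the motion model."""
        self.position += self.velocity * dt
        return self.position

    def correct(self, measurement, dt):
        """Pull the predicted state toward the fused measurement."""
        residual = measurement - self.position
        self.position += self.alpha * residual
        self.velocity += (self.beta / dt) * residual
        return self.position

    def step(self, measurement, dt):
        """Predict, then correct if a measurement arrived for this frame."""
        self.predict(dt)
        if measurement is not None:
            self.correct(measurement, dt)
        return self.position


if __name__ == "__main__":
    # Hypothetical x-coordinate of a wrist at 30 fps; None marks a dropped frame.
    observer = ConstantVelocityObserver(position=0.10)
    for z in [0.11, 0.12, None, 0.14]:
        print(observer.step(z, dt=1 / 30))
```

When a frame is dropped or a joint is occluded, the observer keeps predicting from the last estimated velocity, which is what keeps the fused trajectory continuous between measurements.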
Enrico Martini
Department of Engineering for Innovation Medicine, University of Verona, Verona, Italy; GRASP Laboratory, Department of Mechanical Engineering and Applied Mechanics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
Ho Jin Choi
University of Pennsylvania
Robotics, HRI, Perception, Control
Nadia Figueroa
Presidential Assistant Professor, University of Pennsylvania
Robotics, pHRI, Learning for Control, Dynamical Systems, Collaborative Robots
Nicola Bombieri
University of Verona
Parallel Computing, Heterogeneous architectures, GPU, Parallel graph algorithms