🤖 AI Summary
This work addresses real-time 3D human mesh recovery in online settings such as AR/VR, where most existing methods rely on future frames or global optimization and therefore cannot meet strict latency and causality requirements. The authors propose OnlineHMR, a fully online framework for world-coordinate human mesh reconstruction that jointly targets system-level causality, faithfulness, temporal consistency, and efficiency. The approach combines a two-branch architecture with causal key-value caching, sliding-window learning, a human-centric incremental SLAM module for online world-grounded alignment, and physically plausible trajectory correction. Experiments show performance comparable to existing chunk-based approaches on the EMDB benchmark and on highly dynamic custom videos, while also supporting online inference.
📝 Abstract
Human mesh recovery (HMR) reconstructs the 3D human body from monocular video, and recent work extends it to world-coordinate human trajectory and motion reconstruction. However, most existing methods remain offline, relying on future frames or global optimization, which limits their applicability to interactive-feedback and perception-action-loop scenarios such as AR/VR and telepresence. To address this, we propose OnlineHMR, a fully online framework that jointly satisfies four essential criteria of online processing: system-level causality, faithfulness, temporal consistency, and efficiency. Built upon a two-branch architecture, OnlineHMR enables streaming inference via a causal key-value cache design and a curated sliding-window learning strategy. In addition, a human-centric incremental SLAM provides online world-grounded alignment with physically plausible trajectory correction. Experimental results show that our method achieves performance comparable to existing chunk-based approaches on the standard EMDB benchmark and on highly dynamic custom videos, while uniquely supporting online processing. Page and code are available at https://tsukasane.github.io/Video-OnlineHMR/.
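
The abstract does not spell out the cache design, so the following is only a minimal sketch of the general idea behind streaming inference with a causal key-value cache over a sliding window. It assumes single-head dot-product attention over per-frame feature vectors and a fixed window length. The names `SlidingKVCache`, `step`, and `window` are hypothetical and are not taken from the OnlineHMR code.

```python
import math
from collections import deque


class SlidingKVCache:
    """Hypothetical fixed-size cache of per-frame key/value vectors."""

    def __init__(self, window):
        # deque(maxlen=...) drops the oldest frame once the window is full.
        self.keys = deque(maxlen=window)
        self.values = deque(maxlen=window)

    def step(self, query, key, value):
        """Cache the current frame, then attend over cached frames only.

        Causality holds by construction: no future frame is ever in the cache.
        """
        self.keys.append(key)
        self.values.append(value)
        scale = 1.0 / math.sqrt(len(query))
        scores = [scale * sum(q * k for q, k in zip(query, ks)) for ks in self.keys]
        # Numerically stable softmax over the cached frames.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        dim = len(value)
        return [sum(w * v[i] for w, v in zip(weights, self.values)) for i in range(dim)]


# Stream five toy frames through a window of three (toy features, an assumption).
cache = SlidingKVCache(window=3)
outputs = []
for t in range(5):
    feat = [float(t), 1.0]
    outputs.append(cache.step(feat, feat, feat))
```

Each call to `step` costs time proportional to the window length rather than to the full video length, which is the property that makes this kind of cache attractive for low-latency streaming.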