🤖 AI Summary
Existing methods for 3D hand reconstruction degrade significantly on real-world monocular videos due to perturbations such as hand-object interactions, extreme poses, illumination changes, and motion blur. This work proposes WildGHand, an optimization-based 3D Gaussian splatting framework that explicitly models these perturbations as time-varying biases on the attributes of the 3D Gaussians. A dynamic perturbation disentanglement module and a perturbation-aware optimization strategy, the latter generating per-frame anisotropic weighted masks, allow the framework to identify and suppress perturbations in both spatial and temporal dimensions. Evaluated on a newly collected in-the-wild dataset and two public benchmarks, WildGHand achieves state-of-the-art performance, with up to a 15.8% relative gain in PSNR and a 23.1% relative reduction in LPIPS.
📝 Abstract
Despite recent progress in 3D hand reconstruction from monocular videos, most existing methods rely on data captured in well-controlled environments and therefore degrade in real-world settings with severe perturbations, such as hand-object interactions, extreme poses, illumination changes, and motion blur. To tackle these issues, we introduce WildGHand, an optimization-based framework that enables self-adaptive 3D Gaussian splatting on in-the-wild videos and produces high-fidelity hand avatars. WildGHand incorporates two key components: (i) a dynamic perturbation disentanglement module that explicitly represents perturbations as time-varying biases on 3D Gaussian attributes during optimization, and (ii) a perturbation-aware optimization strategy that generates per-frame anisotropic weighted masks to guide optimization. Together, these components allow the framework to identify and suppress perturbations across both spatial and temporal dimensions. We further curate a dataset of monocular hand videos captured under diverse perturbations to benchmark in-the-wild hand avatar reconstruction. Extensive experiments on this dataset and two public datasets demonstrate that WildGHand achieves state-of-the-art performance and substantially improves over its base model across multiple metrics (e.g., up to a $15.8\%$ relative gain in PSNR and a $23.1\%$ relative reduction in LPIPS). Our implementation and dataset are available at https://github.com/XuanHuang0/WildGHand.
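To make the two key components concrete, here is a minimal NumPy sketch of the ideas the abstract describes, not the authors' implementation: per-frame perturbations are modeled as learnable time-varying biases added to canonical Gaussian attributes, and a per-frame weighted mask down-weights perturbed pixels in the photometric loss. All names (`canonical_mu`, `mu_bias`, `weight_mask`, etc.) and shapes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 1000, 8  # number of Gaussians, number of frames (illustrative sizes)

# Canonical (time-invariant) Gaussian attributes, shared across all frames.
canonical_mu = rng.normal(size=(N, 3))       # Gaussian centers
canonical_opacity = rng.uniform(size=(N,))   # Gaussian opacities

# Time-varying biases, one set per frame, optimized jointly with the
# canonical attributes; initialized at zero so optimization starts from
# the clean canonical model.
mu_bias = np.zeros((T, N, 3))
opacity_bias = np.zeros((T, N))

def gaussians_at_frame(t):
    """Per-frame attributes = canonical attributes + time-varying bias."""
    mu_t = canonical_mu + mu_bias[t]
    opacity_t = np.clip(canonical_opacity + opacity_bias[t], 0.0, 1.0)
    return mu_t, opacity_t

def masked_l1_loss(rendered, target, weight_mask):
    """Photometric loss modulated by a per-frame weighted mask that
    down-weights pixels flagged as perturbed (occlusion, blur, etc.)."""
    return float(np.mean(weight_mask * np.abs(rendered - target)))

# Toy usage: a mask that suppresses a region assumed to be perturbed.
rendered = rng.uniform(size=(4, 4))
target = rng.uniform(size=(4, 4))
weight_mask = np.ones((4, 4))
weight_mask[:, :2] = 0.1  # low weight on the "perturbed" left half

loss = masked_l1_loss(rendered, target, weight_mask)
mu0, op0 = gaussians_at_frame(0)
```

In an actual pipeline, both the biases and the masks would be produced by learned modules and updated by gradient descent during splatting optimization; this sketch only shows the data flow of decoupling perturbations from the canonical hand model.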