ThermoHands: A Benchmark for 3D Hand Pose Estimation from Egocentric Thermal Image

📅 2024-03-14
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address the performance degradation of RGB- and NIR-based 3D hand pose estimation under lighting variations, handwear, and sunlight interference, this paper introduces ThermoHands, the first benchmark for egocentric 3D hand pose estimation from thermal images. The contributions are threefold: (1) a multi-view, multi-spectral dataset collected from 28 subjects and annotated with 3D hand poses through an automated pipeline; (2) TherFormer, a baseline method built on dual transformer modules for 3D hand pose estimation in thermal imagery; (3) experiments on the ThermoHands benchmark in which TherFormer achieves leading performance and thermal imaging remains robust under adverse conditions such as gloved hands, low light, and strong sunlight.

📝 Abstract
Designing egocentric 3D hand pose estimation systems that can perform reliably in complex, real-world scenarios is crucial for downstream applications. Previous approaches using RGB or NIR imagery struggle in challenging conditions: RGB methods are susceptible to lighting variations and obstructions like handwear, while NIR techniques can be disrupted by sunlight or interference from other NIR-equipped devices. To address these limitations, we present ThermoHands, the first benchmark focused on thermal image-based egocentric 3D hand pose estimation, demonstrating the potential of thermal imaging to achieve robust performance under these conditions. The benchmark includes a multi-view and multi-spectral dataset collected from 28 subjects performing hand-object and hand-virtual interactions under diverse scenarios, accurately annotated with 3D hand poses through an automated process. We introduce a new baseline method, TherFormer, utilizing dual transformer modules for effective egocentric 3D hand pose estimation in thermal imagery. Our experimental results highlight TherFormer's leading performance and affirm thermal imaging's effectiveness in enabling robust 3D hand pose estimation in adverse conditions.
Problem

Research questions and friction points this paper is trying to address.

3D hand pose estimation
egocentric thermal images
adverse conditions robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Thermal imaging for 3D hand pose
Dual transformer modules in TherFormer
Multi-view, multi-spectral dataset for training