Survey on Hand Gesture Recognition from Visual Input

📅 2025-01-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: The gesture recognition community lacks a systematic survey that addresses both gesture classification and 3D hand pose estimation from multimodal visual inputs (e.g., RGB, depth, and single- or multi-view video), particularly with regard to four key challenges: robustness in real-world scenarios, occlusion handling, cross-user generalization, and real-time inference. Method: This paper presents a comprehensive, structured review of gesture and 3D hand pose recognition across these inputs, categorizing methodologies (classical machine learning, CNNs, RNNs, Transformers, graph convolutional networks, and multi-view geometric modeling) by input modality and task. It also surveys major benchmark datasets and application contexts and introduces a cross-modal comparative framework. Contribution: The paper distills four critical open challenges (real-world robustness, occlusion handling, cross-user generalization, and real-time performance) and provides a forward-looking research roadmap to guide future advances in multimodal hand understanding.

📝 Abstract

Hand gesture recognition has become an important research area, driven by the growing demand for human-computer interaction in fields such as sign language recognition, virtual and augmented reality, and robotics. Despite the rapid growth of the field, there are few surveys that comprehensively cover recent research developments, available solutions, and benchmark datasets. This survey addresses this gap by examining the latest advancements in hand gesture and 3D hand pose recognition from various types of camera input data, including RGB images, depth images, and videos from monocular or multiview cameras, and the differing methodological requirements of each input type. Furthermore, an overview of widely used datasets is provided, detailing their main characteristics and application domains. Finally, open challenges such as achieving robust recognition in real-world environments, handling occlusions, ensuring generalization across diverse users, and addressing computational efficiency for real-time applications are highlighted to guide future research directions. By synthesizing the objectives, methodologies, and applications of recent studies, this survey offers valuable insights into current trends, challenges, and opportunities for future research in human hand gesture recognition.

Problem

Research questions and friction points this paper is trying to address.

Gesture Recognition

Review and Challenges

Real-world Applications

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hand Gesture Recognition

3D Hand Shape Identification

Real-time Computational Efficiency
