Visual Hand Gesture Recognition with Deep Learning: A Comprehensive Review of Methods, Datasets, Challenges and Future Research Directions

📅 2025-07-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
The visual hand gesture recognition (VHGR) field lacks a systematic, comprehensive survey, hindering researchers’ efficient selection of datasets, models, and methodologies. To address this gap, we present the first structured survey covering static, dynamic, and continuous gesture recognition. We propose a multidimensional taxonomy integrating input modalities, task types, and application scenarios. Within a deep learning framework, we systematically categorize mainstream network architectures, training strategies, dataset characteristics, annotation protocols, and evaluation metrics, and conduct a unified comparative analysis of performance and evolutionary trends. We explicitly identify practical challenges—including occlusion and illumination variation—and highlight cross-modal fusion and real-world robustness as critical future research directions. This work delivers a reusable technical roadmap and practical guidelines for advancing VHGR research and deployment.

📝 Abstract
The rapid evolution of deep learning (DL) models and the ever-increasing size of available datasets have raised the interest of the research community in the long-standing field of vision-based hand gesture recognition (VHGR), and have delivered a wide range of applications, such as sign language understanding and camera-based human-computer interaction. Despite the large volume of research in the field, a structured and complete survey on VHGR is still missing, leaving researchers to navigate hundreds of papers in order to find the right combination of data, model, and approach for each task. The current survey aims to fill this gap by presenting a comprehensive overview of this aspect of computer vision. With a systematic research methodology that identifies the state-of-the-art works, and a structured presentation of the various methods, datasets, and evaluation metrics, this review aims to serve as a practical guideline for researchers, helping them choose the right strategy for a given VHGR task. Starting with the methodology used for study selection, literature retrieval, and analytical framing, the survey identifies and organizes key VHGR approaches in a taxonomy spanning dimensions such as input modality and application domain. The core of the survey provides an in-depth analysis of state-of-the-art techniques across the three primary VHGR tasks: static gesture recognition, isolated dynamic gesture recognition, and continuous gesture recognition. For each task, the prevailing architectural trends and learning strategies are outlined. Additionally, the study reviews commonly used datasets, with emphasis on their annotation schemes, and evaluates standard performance metrics. It concludes by identifying major challenges in VHGR, including both general computer vision issues and domain-specific obstacles, and outlines promising directions for future research.
Problem

Research questions and friction points this paper is trying to address.

Surveying deep learning methods for visual hand gesture recognition
Identifying gaps in current VHGR research and datasets
Providing guidelines for selecting VHGR strategies and future directions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Comprehensive survey on vision-based hand gesture recognition
Taxonomy-based organization of VHGR approaches
In-depth analysis of state-of-the-art techniques