🤖 AI Summary
This work addresses continuous American Sign Language (ASL) fingerspelling recognition for Deaf and Hard of Hearing (DHH) users, an area where existing wearable solutions fall short on accuracy and practicality. We propose SpellRing, a single thumb-worn smart ring that combines active acoustic sensing (a miniaturized speaker and microphone pair) with a six-axis inertial measurement unit (IMU) to capture both handshape and finger movement. A lightweight temporal neural network trained with Connectionist Temporal Classification (CTC) loss performs end-to-end recognition in real time. Evaluated with 20 ASL signers on the MacKenzie-Soukoreff phrase set (1,164 words and 100 phrases), SpellRing achieves 82.45% top-1 and 92.42% top-5 word-level accuracy offline, and a word error rate of 0.099 (9.9%) on phrases in real-time use. These results suggest that a single-point, lightweight wearable can recognize continuous ASL fingerspelling accurately, pointing toward portable, unobtrusive, accessible text input.
📝 Abstract
Fingerspelling is a critical part of American Sign Language (ASL) recognition and has become an accessible, optional text entry method for Deaf and Hard of Hearing (DHH) individuals. In this paper, we introduce SpellRing, a single smart ring worn on the thumb that recognizes words continuously fingerspelled in ASL. SpellRing uses active acoustic sensing (via a microphone and speaker) and an inertial measurement unit (IMU) to track handshape and movement; these signals are processed by a deep learning model trained with Connectionist Temporal Classification (CTC) loss. We evaluated the system with 20 ASL signers (13 fluent and 7 learners), using the MacKenzie-Soukoreff phrase set of 1,164 words and 100 phrases. Offline evaluation yielded top-1 and top-5 word recognition accuracies of 82.45% (9.67%) and 92.42% (5.70%), respectively. In real-time use, the system achieved a word error rate (WER) of 0.099 (0.039) on the phrases. Based on these results, we discuss key lessons and design implications for future minimally obtrusive ASL recognition wearables.
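To make the two technical ingredients named above more concrete, here is a minimal Python sketch of how per-frame outputs from a CTC-trained model can be collapsed into a fingerspelled word, and how a word error rate can be computed for a recognized phrase. It is not the authors' implementation: the abstract does not say which decoder SpellRing uses, so greedy (best-path) decoding is an assumption, as are the names `ctc_greedy_decode`, `word_error_rate`, `ALPHABET`, and the choice of index 0 for the CTC blank.

```python
from typing import List, Sequence

# Assumption for illustration: index 0 is the CTC blank, indices 1..26 map to a..z.
BLANK = 0
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      alphabet: str = ALPHABET) -> str:
    """Greedy CTC decoding: take the best label per frame,
    merge consecutive repeats, then drop blanks."""
    letters: List[str] = []
    previous = BLANK
    for scores in frame_scores:
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit a letter only when it is not blank and differs from the
        # previous frame's label; a blank in between allows double letters.
        if best != BLANK and best != previous:
            letters.append(alphabet[best - 1])
        previous = best
    return "".join(letters)


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance (substitutions, insertions, deletions)
    divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Rolling-row Levenshtein distance over words.
    prev_row = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        row = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev_row[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev_row[j] + 1
            insertion = row[j - 1] + 1
            row.append(min(substitution, deletion, insertion))
        prev_row = row
    return prev_row[-1] / len(ref)
```

Under this definition, a WER of 0.099 corresponds to roughly one word-level error per ten reference words. Top-5 accuracy would need a decoder that returns several ranked candidates (for example, beam search with a lexicon), which this sketch does not cover.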