🤖 AI Summary
To address the lack of AI support for low-resource Bangla Sign Language (BdSL) and the bidirectional communication barrier between Deaf signers and non-signers, this paper introduces Sign Language Instruction Generation (SLIG), a framework for generating pedagogical sign language instructions. Its core innovation is Sign Parameter-Infused (SPI) prompting, a technique that structurally encodes linguistic parameters, including handshape, movement, and orientation, into textual prompts to guide vision-language models (VLMs) in zero-shot generation of structured, reproducible instructional sequences. The paper also presents BdSLIG, the first Bangla Sign Language instruction generation dataset, comprising over 1,200 video–instruction pairs. Experiments demonstrate that SPI significantly improves VLMs’ zero-shot performance on long-tail sign concepts (BLEU +2.8, CIDEr +4.1), supporting low-resource sign language modeling and inclusive sign language learning systems.
📝 Abstract
Sign Language (SL) enables two-way communication for the deaf and hard-of-hearing community, yet many sign languages remain under-resourced in the AI space. Sign Language Instruction Generation (SLIG) produces step-by-step textual instructions that enable non-SL users to imitate and learn SL gestures, promoting two-way interaction. We introduce BdSLIG, the first Bengali SLIG dataset, which we use to evaluate Vision Language Models (VLMs) (i) on under-resourced SLIG tasks, and (ii) on long-tail visual concepts, as Bengali SL is unlikely to appear in VLM pre-training data. To enhance zero-shot performance, we introduce Sign Parameter-Infused (SPI) prompting, which integrates standard SL parameters, such as hand shape, motion, and orientation, directly into the textual prompts. Incorporating standard sign parameters into the prompt makes the generated instructions more structured and reproducible than the free-form natural text produced by vanilla prompting. We envision that our work will promote inclusivity and advancement in SL learning systems for under-resourced communities.
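To make the contrast between vanilla and SPI prompting concrete, here is a minimal sketch of how the two prompt styles might be assembled. The prompt wording, the function names `vanilla_prompt` and `spi_prompt`, and the `<description>` placeholder are all hypothetical; the abstract names only the three parameters (hand shape, motion, orientation), not the actual template.

```python
# Hypothetical sketch: the prompt wording is an assumption, not the paper's template.
# Only the three parameter names come from the abstract.
SIGN_PARAMETERS = ("hand shape", "motion", "orientation")


def vanilla_prompt(sign_label: str) -> str:
    """Free-form prompt with no structural guidance."""
    return (
        f"Look at the sign '{sign_label}' and describe, step by step, "
        "how to perform it."
    )


def spi_prompt(sign_label: str, parameters=SIGN_PARAMETERS) -> str:
    """Prompt that asks for each sign parameter explicitly in every step."""
    fields = "\n".join(
        f"- {name.capitalize()}: <description>" for name in parameters
    )
    return (
        f"Look at the sign '{sign_label}'.\n"
        "For each step of the sign, fill in the following fields:\n"
        f"{fields}\n"
        "Number the steps in the order they are performed."
    )


if __name__ == "__main__":
    print(vanilla_prompt("water"))
    print()
    print(spi_prompt("water"))
```

Because every step must fill the same named fields, outputs from an SPI-style prompt can be compared field by field, which is one plausible reading of why the abstract describes them as more structured and reproducible than free-form text.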