HapticLDM: A Diffusion Model for Text-to-Vibrotactile Generation

📅 2026-05-11
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work proposes the first text-to-haptic generation system based on a latent diffusion model (LDM), targeting semantic misalignment and weak temporal consistency in vibrotactile feedback synthesis. It introduces a text-processing strategy that emphasizes dynamic semantic features, uses that strategy to curate high-quality paired text-vibration data, and incorporates a global denoising mechanism that keeps vibration envelopes stable and coherent over time. Together these address the limited ability of autoregressive methods to capture global dependencies. A/B tests against the state-of-the-art baseline and a user study (N=30) indicate improved realism and semantic alignment, and qualitative feedback suggests that the method produces diverse, subtle, and physically precise vibrations while streamlining the haptic content design pipeline.
📝 Abstract
Text-to-vibration generation converts natural language into haptic feedback, enabling vibration-effect designers to obtain scenario-appropriate vibrations more efficiently. It shows great potential in applications such as the metaverse, games, and film, where it can enrich the user experience in interactive scenarios. The core challenge in this field is how to generate accurate, consistent, and complete vibrations according to textual semantics. Recent autoregressive (AR) approaches (e.g., HapticGen) have limited capacity to capture global dependencies, owing to the inherently sequential nature of their modeling and to prevailing data constraints. In this paper, we propose HapticLDM, the first text-to-vibration generative model built upon Latent Diffusion Models (LDMs). First, on the data side, we introduce a text-processing strategy that emphasizes dynamic characteristics to curate high-quality data pairs for fine-grained dynamic modeling. Second, HapticLDM incorporates a global denoising mechanism that regulates coherent and stable variations in the temporal envelope. Furthermore, we conduct extensive evaluations, including A/B testing against the state-of-the-art baseline and a user study involving 30 participants. The results demonstrate that our model enhances realism and semantic alignment. Qualitative feedback further indicates that HapticLDM simplifies the haptic design workflow while generating diverse, subtle, and physically precise vibrations.
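The abstract contrasts step-by-step autoregressive generation with denoising that acts on the whole sequence at once. The sketch below illustrates that idea using the generic DDPM forward-noising formula, not HapticLDM's actual architecture, which this page does not describe. The linear beta schedule, the 8-step latent "envelope", and all function names are assumptions made purely for illustration.

```python
import math
import random


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def add_noise(x0, alpha_bar_t, rng):
    """Forward process: noise every time step of the sequence in one shot."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    xt = [a * x + b * e for x, e in zip(x0, eps)]
    return xt, eps


def predict_x0(xt, eps_hat, alpha_bar_t):
    """Invert the forward process given a noise estimate for the whole sequence."""
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [(x - b * e) / a for x, e in zip(xt, eps_hat)]


rng = random.Random(0)
# Hypothetical latent envelope: a decaying amplitude profile over 8 steps.
x0 = [math.exp(-0.3 * i) for i in range(8)]
alpha_bar = linear_alpha_bar(1000)
xt, eps = add_noise(x0, alpha_bar[500], rng)
# A trained network would estimate eps from (xt, t, text embedding);
# here the true noise stands in for that estimate.
x0_hat = predict_x0(xt, eps, alpha_bar[500])
```

Because the noise estimate covers every time step jointly, the entire envelope is refined together at each denoising step, whereas an autoregressive model commits to earlier samples before later ones are generated.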
Problem

Research questions and friction points this paper is trying to address.

text-to-vibration
haptic feedback
semantic alignment
vibrotactile generation
user experience
Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent Diffusion Model
text-to-vibration
haptic feedback generation
global denoising
dynamic modeling
🔎 Similar Papers
Jiahao Xiong
Technology Development Center, Guangzhou Shiyuan Electronic Technology Company Limited, Guangzhou 510535, China
Fei Wang
Technology Development Center, Guangzhou Shiyuan Electronic Technology Company Limited, Guangzhou 510535, China
Anran Xu
Department of Earth, Ocean and Atmospheric Sciences, University of British Columbia
inverse problems, machine learning applications
Pinzhi Huang
New York University
Deep Learning, Computer Vision
Tao Wen
The University of Manchester
Networks, Decision making, Data Science, Nonlinear dynamics, Game theory
Lijia Pan
School of Electronic Science and Engineering, Nanjing University, Nanjing 210093, China
Cai Chen
Technology Development Center, Guangzhou Shiyuan Electronic Technology Company Limited, Guangzhou 510535, China