🤖 AI Summary
To address privacy concerns, poor cross-user/environment generalization, and scarce labeled data in remote monitoring of individuals with dementia living alone, this paper proposes a non-intrusive behavior recognition method based on structural vibration sensing. Our approach innovatively synthesizes vibration-domain training data from near-surface acoustic audio recordings and establishes a pretraining–fine-tuning paradigm. We design a cross-domain adaptive framework that, for the first time, enables cross-modal synthesis and transfer from audio to vibration signals, augmented by self-supervised learning to enhance few-shot adaptation. With only a minimal amount of real-world labeled data, personalized model fine-tuning achieves high-accuracy recognition of daily activities across multiple users and households. This significantly reduces reliance on manual annotation while improving model robustness and practical deployability.
📝 Abstract
One in four people with dementia live alone, leading family members to take on caregiving roles from a distance. Many researchers have developed remote monitoring solutions to lessen caregiving needs; however, limitations remain, including privacy preservation, activity recognition, and model generalizability to new users and environments. Structural vibration sensor systems are unobtrusive solutions that have been shown to accurately monitor human information, such as identification and activity recognition, in controlled settings by sensing surface vibrations generated by activities. However, when deployed in an end user's home, current solutions require a substantial amount of labeled data for accurate activity recognition. Our scalable solution adapts synthesized data from near-surface acoustic audio to pretrain a model and allows fine-tuning with very limited data in order to create a robust framework for daily routine tracking.