🤖 AI Summary
Pressure sensors hold untapped potential for human activity recognition (HAR), yet their utility remains limited by severe data scarcity. Method: We propose the first text × pressure bidirectional generative framework (TxP), comprising: (1) PressLang, the first large-scale pressure-language paired dataset; (2) a multimodal model integrating CLIP-based cross-modal alignment, LLaMA 2 13B Chat for language generation, and dynamic graph modeling of pressure sequences; and (3) atomic-action modeling for semantic-aware data augmentation and classification. Results: On real-world yoga and daily-activity benchmarks, the framework improves macro F1 by up to 12.4% over state-of-the-art methods. This work establishes the first bidirectional mapping between pressure signals and natural language, advancing pressure-sensor-based HAR toward interpretability, generalization, and data efficiency.
📝 Abstract
Sensor-based human activity recognition (HAR) has predominantly focused on Inertial Measurement Units and vision data, often overlooking the unique capabilities of pressure sensors, which capture subtle body dynamics and shifts in the center of mass. Despite their potential for postural and balance-based activities, pressure sensors remain underutilized in the HAR domain due to limited datasets. To bridge this gap, we propose combining generative foundation models with pressure-specific HAR techniques. Specifically, we present a bidirectional Text$\times$Pressure (TxP) model that uses generative foundation models to interpret pressure data as natural language. TxP accomplishes two tasks: (1) Text2Pressure, converting activity text descriptions into pressure sequences, and (2) Pressure2Text, generating activity descriptions and classifications from dynamic pressure maps. Leveraging pre-trained models such as CLIP and LLaMA 2 13B Chat, TxP is trained on our synthetic PressLang dataset, which contains over 81,100 text-pressure pairs. Validated on real-world data for activities such as yoga and daily tasks, TxP provides novel approaches to data augmentation and classification grounded in atomic actions. These approaches improve HAR performance by up to 12.4% in macro F1 score over the state of the art, advancing pressure-based HAR with broader applications and deeper insights into human movement.
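For readers unfamiliar with the reported metric, the following is a minimal sketch of how a macro F1 score is computed: the F1 score is calculated per class and then averaged without weighting by class frequency. The function name `macro_f1` and the activity labels in the usage lines are illustrative assumptions, not taken from the paper or its evaluation code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only.
y_true = ["warrior", "warrior", "tree", "tree", "sit", "sit"]
y_pred = ["warrior", "tree", "tree", "tree", "sit", "warrior"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

Because every class contributes equally to the average, macro F1 penalizes poor performance on rare activities more than accuracy does, which is one reason it is commonly reported for imbalanced HAR datasets.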