TxP: Reciprocal Generation of Ground Pressure Dynamics and Activity Descriptions for Improving Human Activity Recognition

📅 2025-05-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Pressure sensors hold untapped potential for human activity recognition (HAR), yet their utility remains limited by severe data scarcity. Method: We propose TxP, a bidirectional text × pressure generative framework, comprising: (1) PressLang, a synthetic pressure-language dataset with over 81,100 text-pressure pairs; (2) a multimodal model integrating CLIP-based cross-modal alignment, LLaMA 2 13B Chat for language generation, and dynamic graph modeling of pressure sequences; and (3) atomic-action modeling for semantic-aware data augmentation and classification. Results: On real-world yoga and daily activity data, the framework improves macro F1 by up to 12.4% over the state of the art. This work establishes a bidirectional mapping between pressure signals and natural language, advancing pressure-sensor-based HAR toward interpretability, generalization, and data efficiency.
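
The summary mentions CLIP-based cross-modal alignment but does not state the training objective. As a rough illustration only, the sketch below shows the symmetric contrastive loss that CLIP itself popularized, applied to paired text and pressure embeddings. The function names, the toy embeddings, and the temperature of 0.07 are assumptions made for this example and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def similarity_logits(text_emb, pressure_emb, temperature=0.07):
    """Cosine similarity of every text/pressure pair, divided by a temperature."""
    texts = [l2_normalize(v) for v in text_emb]
    pressures = [l2_normalize(v) for v in pressure_emb]
    return [
        [sum(a * b for a, b in zip(t, p)) / temperature for p in pressures]
        for t in texts
    ]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_norm - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(text_emb, pressure_emb, temperature=0.07):
    """Average of the text-to-pressure and pressure-to-text losses."""
    logits = similarity_logits(text_emb, pressure_emb, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(transposed))


# Hypothetical 2-D embeddings: row i of each list belongs to the same pair.
text_embeddings = [[1.0, 0.0], [0.0, 1.0]]
aligned_pressure = [[1.0, 0.0], [0.0, 1.0]]
misaligned_pressure = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = symmetric_contrastive_loss(text_embeddings, aligned_pressure)
loss_misaligned = symmetric_contrastive_loss(text_embeddings, misaligned_pressure)
```

The loss is small when each description is closest to its own pressure embedding and large when the pairs are swapped, which is the property an alignment stage of this kind relies on.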

📝 Abstract
Sensor-based human activity recognition (HAR) has predominantly focused on Inertial Measurement Units and vision data, often overlooking the capabilities unique to pressure sensors, which capture subtle body dynamics and shifts in the center of mass. Despite their potential for postural and balance-based activities, pressure sensors remain underutilized in the HAR domain due to limited datasets. To bridge this gap, we propose to exploit generative foundation models with pressure-specific HAR techniques. Specifically, we present a bidirectional Text$\times$Pressure (TxP) model that uses generative foundation models to interpret pressure data as natural language. TxP accomplishes two tasks: (1) Text2Pressure, converting activity text descriptions into pressure sequences, and (2) Pressure2Text, generating activity descriptions and classifications from dynamic pressure maps. Leveraging pre-trained models like CLIP and LLaMA 2 13B Chat, TxP is trained on our synthetic PressLang dataset, containing over 81,100 text-pressure pairs. Validated on real-world data for activities such as yoga and daily tasks, TxP provides novel approaches to data augmentation and classification grounded in atomic actions. This improved HAR performance by up to 12.4% in macro F1 score compared to the state-of-the-art, advancing pressure-based HAR with broader applications and deeper insights into human movement.
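
To make the idea of a dynamic pressure map concrete, the sketch below computes the center of pressure, the pressure-weighted centroid of each frame, across a short sequence. The center of pressure is a common proxy for balance shifts but is not identical to the center of mass, and the abstract does not say whether TxP uses this feature. The grid size, values, and function names are assumptions for illustration.

```python
def center_of_pressure(frame):
    """Return the pressure-weighted (row, col) centroid of one frame.

    `frame` is a 2-D list of non-negative pressure readings.
    Returns None when nothing is pressing on the sensor.
    """
    total = sum(sum(row) for row in frame)
    if total == 0:
        return None
    row_pos = sum(i * value for i, row in enumerate(frame) for value in row) / total
    col_pos = sum(j * value for row in frame for j, value in enumerate(row)) / total
    return (row_pos, col_pos)


def cop_trajectory(sequence):
    """Center of pressure for every frame of a pressure sequence."""
    return [center_of_pressure(frame) for frame in sequence]


# Hypothetical 2x2 sensor grid: weight moves from the top-left cell
# to an even spread, then to the bottom-right cell.
sequence = [
    [[4.0, 0.0], [0.0, 0.0]],
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 4.0]],
]
trajectory = cop_trajectory(sequence)
```

In this toy sequence the centroid travels along the grid diagonal, which is the kind of shift that inertial or vision data would not expose directly.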
Problem

Research questions and friction points this paper is trying to address.

Underutilization of pressure sensors in human activity recognition due to limited datasets.
Need for generative models to interpret pressure data as natural language descriptions.
Improving activity recognition performance by leveraging synthetic pressure-text paired data.
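
Recognition performance in the abstract is reported as macro F1, which averages the per-class F1 scores so that rare activities count as much as frequent ones. The sketch below is a minimal reference implementation of that standard metric. The activity labels are invented for the example and are not results from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 if the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for four samples.
y_true = ["sit", "sit", "stand", "walk"]
y_pred = ["sit", "stand", "stand", "walk"]
score = macro_f1(y_true, y_pred)
```

Here "sit" and "stand" each score 2/3 and "walk" scores 1, so the macro average is 7/9.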
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bidirectional Text-Pressure model for HAR
Generative foundation models interpret pressure data
Synthetic PressLang dataset with 81,100 pairs
L. Ray
DFKI, Germany
L. Krupp
RPTU and DFKI, Germany
V. F. Rey
RPTU and DFKI, Germany
Bo Zhou
RPTU and DFKI, Germany
Sungho Suh
Korea University
Generative Models, Human Activity Recognition, Multimodal Learning, Wearable Computing
P. Lukowicz
RPTU and DFKI, Germany