🤖 AI Summary
Existing benchmarks rarely integrate radio-frequency (RF) sensors (e.g., mmWave radar, RFID) with inertial measurement units (IMUs) and optical sensing (infrared cameras), which hinders cross-modal joint modeling of human behavior and affective states. To address this, we introduce RF-Behavior, a synchronized, spatiotemporally aligned multimodal RF dataset comprising 13 mmWave radars, 6 to 8 RFID tags, LoRa modules, IMUs, and 24 infrared cameras, capturing gestures, activities, and emotional states. The key contribution is the system-level integration of RF, inertial, and optical modalities, enabling end-to-end perception from low-level motion to high-level affect. Benchmark evaluation shows that the strategic sensor placement makes the modalities complementary for cross-modal activity recognition and emotion inference, with distinct performance characteristics across behavioral categories. The dataset establishes a reproducible benchmark for multi-task joint modeling and cross-modal learning in human-centered sensing.
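To make the cross-modal setup concrete, the sketch below shows one possible way a synchronized, spatiotemporally aligned sample could be represented and how per-modality predictions might be combined by simple late fusion. All field names, array shapes, and the fusion rule are illustrative assumptions and do not reflect the released dataset schema or the authors' pipeline.

```python
# Hypothetical sketch of a synchronized multimodal sample and simple late fusion.
# Field names and shapes are illustrative assumptions, not the dataset's actual schema.
from dataclasses import dataclass
import numpy as np

@dataclass
class RFBehaviorSample:
    """One spatiotemporally aligned recording window (assumed layout)."""
    radar: np.ndarray      # (13, T, range_bins, doppler_bins): 8 ground + 5 ceiling mmWave radars
    rfid: np.ndarray       # (n_tags, T): phase/RSSI streams from 6-8 arm-worn tags
    lora: np.ndarray       # (T, n_channels): LoRa channel measurements
    imu: np.ndarray        # (n_imus, T, 6): accelerometer + gyroscope motion ground truth
    infrared: np.ndarray   # (24, T, H, W): infrared camera frames
    label: int             # gesture / activity / sentiment class id

def late_fusion(per_modality_probs: list[np.ndarray]) -> int:
    """Average class probabilities from independent per-modality models and pick the argmax."""
    fused = np.mean(np.stack(per_modality_probs, axis=0), axis=0)
    return int(np.argmax(fused))

# Example: combine (dummy) predictions from radar, RFID, and IMU models over 21 gesture classes.
rng = np.random.default_rng(0)
probs = [rng.dirichlet(np.ones(21)) for _ in range(3)]
print("fused prediction:", late_fusion(probs))
```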
📝 Abstract
Recent research has demonstrated the complementary nature of camera-based and inertial data for modeling human gestures, activities, and sentiment. Yet, despite the growing importance of radio frequency (RF) data for environmental sensing and the advance of joint communication and sensing in prospective WiFi and 6G standards, datasets that integrate these modalities with RF data (radar and RFID) remain rare. We introduce RF-Behavior, a multimodal radio frequency dataset for comprehensive human behavior and emotion analysis. We collected data from 44 participants performing 21 gestures, 10 activities, and 6 sentiment expressions. Data were captured using synchronized sensors, including 13 radars (8 ground-mounted and 5 ceiling-mounted), 6 to 8 RFID tags (attached to each arm), and LoRa modules. Inertial measurement units (IMUs) and 24 infrared cameras provide precise motion ground truth. RF-Behavior provides a unified multimodal dataset spanning the full spectrum of human behavior -- from brief gestures to activities and emotional states -- enabling research on multi-task learning across motion and emotion recognition. Benchmark results demonstrate that the strategic sensor placement makes the modalities complementary, with distinct performance characteristics across behavioral categories.
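The abstract positions the dataset as a testbed for multi-task learning across motion and emotion recognition. The sketch below illustrates one minimal way such a setup could look: a shared encoder over fused multimodal features with separate activity and sentiment heads. The architecture, feature dimensions, and loss weighting are assumptions made for illustration and are not taken from the paper.

```python
# Minimal multi-task sketch: shared encoder with activity and sentiment heads.
# Architecture and dimensions are illustrative assumptions, not the authors' model.
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    def __init__(self, in_dim=256, n_activities=10, n_sentiments=6):
        super().__init__()
        # Shared encoder over fused multimodal features (e.g., radar + IMU embeddings).
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU())
        self.activity_head = nn.Linear(128, n_activities)   # low-level motion task
        self.sentiment_head = nn.Linear(128, n_sentiments)  # high-level affect task

    def forward(self, x):
        z = self.encoder(x)
        return self.activity_head(z), self.sentiment_head(z)

model = MultiTaskModel()
features = torch.randn(4, 256)                      # a batch of fused feature vectors
act_logits, sent_logits = model(features)
# Joint loss over both tasks (dummy labels); in practice the two terms may be weighted.
loss = nn.functional.cross_entropy(act_logits, torch.randint(0, 10, (4,))) \
     + nn.functional.cross_entropy(sent_logits, torch.randint(0, 6, (4,)))
loss.backward()
```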