egoEMOTION: Egocentric Vision and Physiological Signals for Emotion and Personality Recognition in Real-World Tasks

📅 2025-10-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing egocentric vision benchmarks largely neglect intrinsic drivers such as emotion and personality, focusing instead on physical behavior modeling, which limits intent understanding. Method: We introduce the first multimodal egocentric dataset integrating first-person vision and physiological signals, featuring 43 participants and over 50 hours of real-world data. Synchronized recordings include eye-tracking video, photoplethysmography (PPG), and inertial measurement unit (IMU) data, annotated with fine-grained emotion labels (using the circumplex model and Mikels' wheel) and personality traits (via the Big Five Inventory). Contribution/Results: We establish emotion and personality as core dimensions of egocentric perception and propose three novel benchmark tasks: continuous emotion prediction, discrete emotion recognition, and personality trait regression. Experiments demonstrate that vision-only models outperform conventional physiological-signal-based methods in real-world affect prediction, confirming the strong representational power of first-person visual cues for inferring internal states and opening a new direction for modeling the intrinsic drivers of behavior.
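
For readers unfamiliar with the circumplex model mentioned above, the sketch below illustrates the basic idea: affect is rated on continuous valence and arousal axes, and a coarse discrete label can be read off the quadrant. The thresholds and label names here are illustrative assumptions, not the paper's annotation scheme.

```python
# Illustrative only: mapping a circumplex-model rating to a quadrant label.
# The cutoff at 0 and the label names are assumptions for this sketch.
def circumplex_quadrant(valence: float, arousal: float) -> str:
    """Map a (valence, arousal) rating in [-1, 1]^2 to a quadrant label."""
    if valence >= 0:
        return "excited/happy" if arousal >= 0 else "calm/content"
    return "angry/afraid" if arousal >= 0 else "sad/bored"

print(circumplex_quadrant(0.7, 0.4))    # excited/happy
print(circumplex_quadrant(-0.5, -0.6))  # sad/bored
```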

📝 Abstract
Understanding affect is central to anticipating human behavior, yet current egocentric vision benchmarks largely ignore the emotional states that shape a person's decisions and actions. Existing tasks in egocentric perception focus on physical activities, hand-object interactions, and attention modeling, assuming neutral affect and uniform personality. This limits the ability of vision systems to capture key internal drivers of behavior. In this paper, we present egoEMOTION, the first dataset that couples egocentric visual and physiological signals with dense self-reports of emotion and personality across controlled and real-world scenarios. Our dataset includes over 50 hours of recordings from 43 participants, captured using Meta's Project Aria glasses. Each session provides synchronized eye-tracking video, head-mounted photoplethysmography, inertial motion data, and physiological baselines for reference. Participants completed emotion-elicitation tasks and naturalistic activities while self-reporting their affective state using the Circumplex Model and Mikels' Wheel, as well as their personality via the Big Five model. We define three benchmark tasks: (1) continuous affect classification (valence, arousal, dominance); (2) discrete emotion classification; and (3) trait-level personality inference. We show that a classical learning-based method, used as a simple baseline for real-world affect prediction, produces better estimates from the signals captured by egocentric vision systems than from physiological signals. Our dataset establishes emotion and personality as core dimensions in egocentric perception and opens new directions in affect-driven modeling of behavior, intent, and interaction.
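
To make the baseline claim concrete, here is a minimal sketch of what a classical learning-based pipeline for Task 1 (continuous affect classification) could look like, assuming pre-extracted per-window features from the egocentric streams. The feature layout, window count, label binning, and random-forest choice are assumptions for illustration; the paper does not prescribe this exact pipeline.

```python
# A minimal sketch of a classical baseline for continuous affect
# classification, using synthetic placeholder data in place of the dataset.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Placeholder per-window features from the egocentric streams, e.g. gaze
# statistics (fixation rate, saccade amplitude), IMU head-motion summaries,
# and PPG-derived heart-rate variability. Here: 500 windows x 24 features.
X = rng.normal(size=(500, 24))
# Placeholder self-report labels, e.g. low/medium/high valence per window.
y = rng.integers(0, 3, size=500)

clf = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=200, random_state=0),
)
scores = cross_val_score(clf, X, y, cv=5)
print(f"5-fold accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```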
Problem

Research questions and friction points this paper is trying to address.

Recognizing emotions and personality from egocentric vision and physiological signals
Addressing the gap in capturing internal affective states in egocentric perception
Establishing emotion and personality as core dimensions for behavior modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combining egocentric vision with physiological signals for emotion recognition
Using wearable glasses to capture synchronized multimodal data streams (see the alignment sketch after this list)
Defining three benchmark tasks: continuous affect classification, discrete emotion classification, and personality trait inference
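
As referenced in the list above, a core practical step for any benchmark on this dataset is aligning the synchronized streams (eye tracking, PPG, IMU) onto a common clock before cutting fixed-length windows. The sketch below shows one way to do this with pandas; the sampling rates, stream names, and 10-second window length are assumptions, not specifications from the paper.

```python
# A minimal sketch of resampling two streams onto a shared grid and
# segmenting them into fixed-length analysis windows.
import numpy as np
import pandas as pd

def to_series(timestamps_s, values, name):
    """Wrap a raw stream as a time-indexed series so it can be resampled."""
    idx = pd.to_timedelta(timestamps_s, unit="s")
    return pd.Series(values, index=idx, name=name)

# Placeholder streams: PPG at ~100 Hz, IMU acceleration magnitude at ~200 Hz.
t_ppg = np.arange(0, 60, 1 / 100)
t_imu = np.arange(0, 60, 1 / 200)
ppg = to_series(t_ppg, np.sin(2 * np.pi * 1.2 * t_ppg), "ppg")
imu = to_series(t_imu, np.abs(np.random.default_rng(0).normal(size=t_imu.size)), "imu_acc")

# Resample both onto a shared 50 Hz grid (20 ms bins), then cut 10 s windows,
# keeping only complete windows of 500 samples.
grid = pd.concat([s.resample("20ms").mean().interpolate() for s in (ppg, imu)], axis=1)
windows = [w for _, w in grid.resample("10s") if len(w) == 500]
print(f"{len(windows)} windows of shape {windows[0].shape}")
```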
Matthias Jammot
Department of Computer Science, ETH Zurich, Switzerland
Björn Braun
Department of Computer Science, ETH Zurich, Switzerland
Paul Streli
PhD student, ETH Zurich (Computer Vision, Machine Learning, Human-Computer Interaction)
Rafael Wampfler
Department of Computer Science, ETH Zurich, Switzerland
Christian Holz
Associate Professor, ETH Zurich (Mixed Reality, Perception, Human-Computer Interaction, Digital Biomarkers)