Beyond Gaze Overlap: Analyzing Joint Visual Attention Dynamics Using Egocentric Data

📅 2025-09-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses the dynamic modeling of joint visual attention (JVA) in multi-user social settings. We propose a first-person (egocentric) eye-tracking approach: spatiotemporal tubes are constructed around individual gaze trajectories; deep-learning-based feature mapping over these tubes identifies shared gaze patterns; and the ambient–focal attention coefficient K quantifies attentional convergence and divergence under cooperative versus independent conditions. Evaluated on real-world collaborative tasks, the method achieves JVA rates of 44–46% on object-focused collaborative tasks, substantially above independent-task baselines (4–5%), demonstrating strong discriminability between social interaction states. The work provides an interpretable computational representation of human attention-coordination mechanisms and a scalable multimodal analytical framework for psychological science and social robotics research.
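The tube-based detection the summary describes can be sketched as follows. This is a minimal illustration under stated assumptions (fixed square crops centered on gaze, cosine similarity between precomputed per-frame embeddings, and a hard similarity threshold), not the paper's actual network; all names (`extract_tube`, `jva_rate`) and parameters are hypothetical.

```python
import numpy as np

def extract_tube(frames, gaze_xy, half=32):
    """Crop a gaze-centered spatiotemporal tube from an egocentric video.

    frames:  (T, H, W, C) array of frames.
    gaze_xy: (T, 2) per-frame gaze coordinates (x, y) in pixels.
    Returns a (T, 2*half, 2*half, C) tube; borders are zero-padded so
    gaze points near the frame edge still yield full-size crops.
    """
    padded = np.pad(frames, ((0, 0), (half, half), (half, half), (0, 0)))
    T, C = frames.shape[0], frames.shape[3]
    tube = np.empty((T, 2 * half, 2 * half, C), dtype=frames.dtype)
    for t, (x, y) in enumerate(gaze_xy):
        cx, cy = int(x) + half, int(y) + half  # shift into padded coords
        tube[t] = padded[t, cy - half:cy + half, cx - half:cx + half]
    return tube

def jva_rate(feat_a, feat_b, threshold=0.8):
    """Fraction of frames counted as joint visual attention.

    feat_a, feat_b: (T, D) per-frame feature embeddings of two users'
    tubes (e.g., from a spatiotemporal CNN). A frame counts as JVA when
    the cosine similarity of the two embeddings exceeds the threshold.
    """
    sim = np.sum(feat_a * feat_b, axis=1) / (
        np.linalg.norm(feat_a, axis=1) * np.linalg.norm(feat_b, axis=1) + 1e-8)
    return float(np.mean(sim >= threshold))
```

The reported 44–46% vs. 4–5% figures would correspond to `jva_rate` evaluated over collaborative vs. independent sessions.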

📝 Abstract
Joint visual attention (JVA) provides informative cues on human behavior during social interactions. The ubiquity of egocentric eye-trackers and large-scale datasets on everyday interactions offer research opportunities in identifying JVA in multi-user environments. We propose a novel approach utilizing spatiotemporal tubes centered on attention rendered by individual gaze and detect JVA using deep-learning-based feature mapping. Our results reveal object-focused collaborative tasks to yield higher JVA (44-46%), whereas independent tasks yield lower (4-5%) attention. Beyond JVA, we analyze attention characteristics using the ambient-focal attention coefficient $\mathcal{K}$ to understand the qualitative aspects of shared attention. Our analysis reveals $\mathcal{K}$ to converge in instances where participants interact with shared objects, while diverging when they act independently. While our study presents seminal findings on joint attention with egocentric commodity eye trackers, it indicates the potential utility of our approach in psychology, human-computer interaction, and social robotics, particularly in understanding attention coordination mechanisms in ecologically valid contexts.
Problem

Research questions and friction points this paper is trying to address.

Analyzing joint visual attention dynamics using egocentric data
Detecting joint visual attention through deep-learning feature mapping
Understanding attention coordination mechanisms in social interactions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spatiotemporal tubes for joint attention analysis
Deep-learning-based feature mapping for JVA detection
Ambient-focal attention coefficient K characterization
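In the eye-tracking literature (Krejtz et al., 2016), the ambient-focal coefficient K listed above is defined per fixation as the z-scored fixation duration minus the z-scored amplitude of the following saccade, with z-scores computed over the whole recording. Assuming the paper follows that standard definition, a minimal sketch (the function name is illustrative, not from the paper):

```python
import numpy as np

def coefficient_k_series(fix_durations, next_saccade_amplitudes):
    """Per-fixation ambient-focal coefficient K_i (Krejtz et al., 2016).

    K_i = z(d_i) - z(a_{i+1}), where d_i is the duration of fixation i
    and a_{i+1} is the amplitude of the saccade that follows it.
    K_i > 0 suggests focal viewing (long fixations, short saccades);
    K_i < 0 suggests ambient viewing (short fixations, long saccades).
    """
    d = np.asarray(fix_durations, dtype=float)
    a = np.asarray(next_saccade_amplitudes, dtype=float)
    return (d - d.mean()) / d.std() - (a - a.mean()) / a.std()
```

Because both z-scores are standardized over the full recording, the session-wide mean of K is zero by construction, so K is interpreted over windows or task segments; the paper's convergence/divergence analysis would compare such windowed values across participants.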
Kumushini Thennakoon
Department of Computer Science, Old Dominion University, Norfolk, VA, USA
Yasasi Abeysinghe
Department of Computer Science, Old Dominion University, Norfolk, VA, USA
Bhanuka Mahanama
Department of Computer Science, Old Dominion University, Norfolk, VA, USA
Vikas Ashok
Department of Computer Science, Old Dominion University, Norfolk, VA, USA
Sampath Jayarathna
Associate Professor of Computer Science, Old Dominion University. ONR Faculty Fellow, NSWC
data science, neuro-information retrieval, eye tracking, digital library, @WebSciDL