Where are they looking in the operating room?

📅 2026-04-22

🤖 AI Summary
This work addresses the lack of visual attention modeling for clinical staff in the operating room, a gap that has limited the understanding of surgical roles, procedural phases, and team communication. The study introduces gaze-following as a new task in surgical settings, using versions of the 4D-OR and Team-OR datasets extended with gaze-following annotations. For clinical role prediction and surgical phase recognition, the authors propose a gaze heatmap-based method that relies on gaze predictions alone; for team communication detection, they train a self-supervised spatial-temporal model on gaze-based clip features and pass those features to a temporal activity detection model. The approach achieves F1 scores of 0.92 for clinical role prediction and 0.95 for surgical phase recognition, and improves on the previous best team communication detection results by over 30%.
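For readers less familiar with the metric, the F1 score quoted above is the harmonic mean of precision and recall. The snippet below is a minimal sketch of that definition only; the counts are made up for illustration and are not taken from the paper, the function name `f1_score` is hypothetical, and the summary does not state how per-class scores are averaged.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 as the harmonic mean of precision and recall for one class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Illustrative counts only (not results from the paper).
example = f1_score(tp=8, fp=2, fn=2)
```

A score of 1.0 requires both perfect precision and perfect recall, which is why F1 is a stricter summary than accuracy when classes are imbalanced.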

📝 Abstract
Purpose: Gaze-following, the task of inferring where individuals are looking, has been widely studied in computer vision, advancing research in visual attention modeling, social scene understanding, and human-robot interaction. However, gaze-following has never been explored in the operating room (OR), a complex, high-stakes environment where visual attention plays an important role in surgical workflow analysis. In this work, we introduce the concept of gaze-following to the surgical domain, and demonstrate its great potential for understanding clinical roles, surgical phases, and team communications in the OR. Methods: We extend the 4D-OR dataset with gaze-following annotations, and extend the Team-OR dataset with gaze-following annotations and new team communication activity annotations. Then, we propose novel approaches to address clinical role prediction, surgical phase recognition, and team communication detection using a gaze-following model. For role and phase recognition, we propose a gaze heatmap-based approach that uses only gaze predictions; for team communication detection, we train a spatial-temporal model in a self-supervised way that encodes gaze-based clip features, and then feed the features into a temporal activity detection model. Results: Experimental results on the 4D-OR and Team-OR datasets demonstrate that our approach achieves state-of-the-art performance on all downstream tasks. Quantitatively, our approach obtains F1 scores of 0.92 for clinical role prediction and 0.95 for surgical phase recognition. Furthermore, it significantly outperforms existing baselines in team communication detection, improving previous best performances by over 30%. Conclusion: We introduce gaze-following in the OR as a novel research direction in surgical data science, highlighting its great potential to advance surgical workflow analysis in computer-assisted interventions.
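The abstract does not say how the gaze heatmaps are constructed. As a rough illustration of the general idea, the sketch below assumes that each predicted gaze point contributes one isotropic Gaussian to a 2D grid and that the result is normalized to sum to 1. The function names, the `sigma` value, and the sample points are all hypothetical and are not the authors' implementation.

```python
import math


def gaze_heatmap(points, height, width, sigma=3.0):
    """Sum one Gaussian per gaze point (x, y) in pixels, then normalize to 1."""
    heatmap = [[0.0] * width for _ in range(height)]
    for gx, gy in points:
        for y in range(height):
            for x in range(width):
                d2 = (x - gx) ** 2 + (y - gy) ** 2
                heatmap[y][x] += math.exp(-d2 / (2.0 * sigma ** 2))
    total = sum(map(sum, heatmap))
    if total > 0:
        heatmap = [[v / total for v in row] for row in heatmap]
    return heatmap


def peak(heatmap):
    """Return the (x, y) cell with the largest accumulated attention."""
    best = max((v, x, y) for y, row in enumerate(heatmap) for x, v in enumerate(row))
    return best[1], best[2]


# Hypothetical gaze predictions for one person over a few frames.
points = [(10, 5), (11, 5), (10, 6), (30, 20)]
hm = gaze_heatmap(points, height=32, width=48)
hot_x, hot_y = peak(hm)
```

Under these assumptions, a person whose gaze clusters on one region produces a concentrated heatmap, and such maps could then serve as input features for a downstream role or phase classifier.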
Problem

Research questions and friction points this paper is trying to address.

gaze-following
operating room
surgical workflow analysis
clinical role
team communication
Innovation

Methods, ideas, or system contributions that make the work stand out.

gaze-following
surgical workflow analysis
self-supervised learning
team communication detection
clinical role prediction
Keqi Chen
University of Strasbourg, CNRS, INSERM, ICube, UMR7357, France
Séraphin Baributsa
University of Strasbourg, CNRS, INSERM, ICube, UMR7357, France
Lilien Schewski
Department for Biomedical Research (DBMR), University of Bern, 3008 Bern, Switzerland
Vinkle Srivastav
Research Scientist (Chargé de recherche R&D) at CAMMA lab, IHU Strasbourg, France
Surgical data science, Vision-language models, Human pose estimation, Medical image analysis
Didier Mutter
Professor of Surgery, Hôpitaux Universitaires de Strasbourg
Surgery, Teaching, Computer Science
Guido Beldi
Professor of Surgery, Bern University Hospital
Surgery, immunology, liver, regeneration
Sandra Keller
Department for Biomedical Research (DBMR), University of Bern, 3008 Bern, Switzerland
Nicolas Padoy
Professor of Computer Science, University of Strasbourg
Surgical Data Science, Medical Image Analysis, Computer Vision, Machine Learning