Self-Supervised Uncalibrated Multi-View Video Anonymization in the Operating Room

📅 2026-02-02
🤖 AI Summary
This work proposes a self-supervised, multi-view surgical video anonymization framework that requires neither manual annotations nor camera calibration, addressing the limited scalability of existing methods to new clinical sites and repositioned cameras. The framework combines low-threshold candidate detection, uncalibrated self-supervised multi-view association, and temporal tracking to recover missed detections, then uses the recovered detections as pseudo-labels to iteratively fine-tune the whole-body detector; the whole-body pose model is fine-tuned on its own high-score predictions. Evaluated on the 4D-OR dataset of simulated surgeries and on a dataset of real surgeries, the approach achieves over 97% recall. A real-time whole-body detector trained on the same pseudo-labels reaches comparable performance, supporting the method's practical applicability.

📝 Abstract
Privacy preservation is a prerequisite for using video data in Operating Room (OR) research. Effective anonymization relies on the exhaustive localization of every individual; even a single missed detection necessitates extensive manual correction. However, existing approaches face two critical scalability bottlenecks: (1) they usually require manual annotations of each new clinical site for high accuracy; (2) while multi-camera setups have been widely adopted to address single-view ambiguity, camera calibration is typically required whenever cameras are repositioned. To address these problems, we propose a novel self-supervised multi-view video anonymization framework consisting of whole-body person detection and whole-body pose estimation, without annotation or camera calibration. Our core strategy is to enhance the single-view detector by "retrieving" false negatives using temporal and multi-view context, and conducting self-supervised domain adaptation. We first run an off-the-shelf whole-body person detector in each view with a low score threshold to gather candidate detections. Then, we retrieve the low-score false negatives that exhibit consistency with the high-score detections via tracking and self-supervised uncalibrated multi-view association. These recovered detections serve as pseudo labels to iteratively fine-tune the whole-body detector. Finally, we apply whole-body pose estimation on each detected person, and fine-tune the pose model using its own high-score predictions. Experiments on the 4D-OR dataset of simulated surgeries and our dataset of real surgeries show the effectiveness of our approach, which achieves over 97% recall. Moreover, we train a real-time whole-body detector using our pseudo labels, achieving comparable performance and highlighting our method's practical applicability. Code will be available at https://github.com/CAMMA-public/OR_anonymization.
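As a rough illustration of the "retrieve low-score false negatives" idea in the abstract, the sketch below promotes a low-score candidate to a pseudo label when it overlaps a high-score detection in an adjacent frame of the same view. This is not the authors' implementation: the IoU-based neighbour check is a simplified stand-in for their tracking step, the uncalibrated multi-view association is omitted entirely, and every name and threshold (`Detection`, `recover_pseudo_labels`, `high_thr`, `low_thr`, `iou_thr`) is a hypothetical choice made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    box: tuple  # (x1, y1, x2, y2) in pixels
    score: float  # detector confidence in [0, 1]


def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recover_pseudo_labels(frames, high_thr=0.7, low_thr=0.1, iou_thr=0.5):
    """Select pseudo labels for one camera view.

    frames: list of per-frame lists of Detection (candidates gathered
    with a low score threshold). All thresholds are illustrative
    assumptions, not values from the paper.
    """
    pseudo = []
    for t, dets in enumerate(frames):
        # High-score detections in the previous and next frame act as anchors.
        anchors = []
        for dt in (-1, 1):
            if 0 <= t + dt < len(frames):
                anchors += [d for d in frames[t + dt] if d.score >= high_thr]
        kept = []
        for d in dets:
            if d.score >= high_thr:
                kept.append(d)  # confident detection: keep as-is
            elif d.score >= low_thr and any(
                iou(d.box, a.box) >= iou_thr for a in anchors
            ):
                kept.append(d)  # low-score candidate supported by temporal context
        pseudo.append(kept)
    return pseudo


# Toy example: the person is detected confidently in frames 0 and 2,
# but only weakly in frame 1, alongside an unsupported spurious candidate.
frames = [
    [Detection((10, 10, 50, 100), 0.90)],
    [Detection((12, 10, 52, 100), 0.20), Detection((200, 200, 240, 300), 0.20)],
    [Detection((14, 10, 54, 100), 0.85)],
]
pseudo = recover_pseudo_labels(frames)
```

In this toy input, the weak middle-frame box is kept because it overlaps confident boxes in the neighbouring frames, while the isolated low-score box is discarded. In the framework described above, such recovered boxes would then be used to fine-tune the detector, and the procedure would be repeated.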
Problem

Research questions and friction points this paper is trying to address.

video anonymization
operating room
multi-view
self-supervised
privacy preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

self-supervised learning
multi-view video anonymization
uncalibrated camera setup
pseudo-labeling
whole-body pose estimation
Keqi Chen
University of Strasbourg, CNRS, INSERM, ICube, UMR7357, France
V. Srivastav
IHU Strasbourg, 67000 Strasbourg, France
A. Vardazaryan
IHU Strasbourg, 67000 Strasbourg, France
Cindy Rolland
IHU Strasbourg, 67000 Strasbourg, France
Didier Mutter
Professor of Surgery, Hôpitaux Universitaires de Strasbourg
Surgery, Teaching, Computer Science
Nicolas Padoy
Professor of Computer Science, University of Strasbourg
Surgical Data Science, Medical Image Analysis, Computer Vision, Machine Learning