Did you just see that? Arbitrary view synthesis for egocentric replay of operating room workflows from ambient sensors

📅 2025-10-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional operating room (OR) observation relies on static surveillance cameras or subjective recall, failing to capture clinicians’ authentic egocentric perspectives—hindering surgical safety analysis, training, and workflow optimization. To address this, we propose a non-invasive egocentric viewpoint reconstruction framework that, for the first time, leverages existing fixed OR video streams and environmental sensor data to enable full-scene dynamic 3D reconstruction and arbitrary-viewpoint replay. Our method integrates multi-view geometric modeling with neural rendering and incorporates a diffusion model to enhance novel-view synthesis fidelity. Evaluated on multicenter surgical datasets, it achieves high-fidelity personalized field-of-view reconstruction and free-navigable viewpoint rendering—without requiring wearable devices or disrupting clinical workflows. This work transforms conventional OR surveillance systems into interactive 3D visual recording platforms, establishing a new paradigm for surgical cognition research, immersive training, and intelligent procedural analytics.

📝 Abstract
Observing surgical practice has historically relied on fixed vantage points or recollections, leaving the egocentric visual perspectives that guide clinical decisions undocumented. Fixed-camera video can capture surgical workflows at room scale, but cannot reconstruct what each team member actually saw. Thus, these videos provide only limited insight into how decisions that affect surgical safety, training, and workflow optimization are made. Here we introduce EgoSurg, the first framework to reconstruct dynamic, egocentric replays for any operating room (OR) staff member directly from wall-mounted fixed-camera video, and thus without intervening in the clinical workflow. EgoSurg couples geometry-driven neural rendering with diffusion-based view enhancement, enabling high-fidelity synthesis of arbitrary and egocentric viewpoints at any moment. In evaluations across multi-site surgical cases and controlled studies, EgoSurg reconstructs person-specific visual fields and arbitrary viewpoints with high visual quality and fidelity. By transforming existing OR camera infrastructure into a navigable dynamic 3D record, EgoSurg establishes a new foundation for immersive surgical data science, enabling surgical practice to be visualized, experienced, and analyzed from every angle.
Problem

Research questions and friction points this paper is trying to address.

Reconstructs egocentric surgical viewpoints from fixed cameras
Enables arbitrary view synthesis without disrupting clinical workflow
Transforms operating room videos into navigable 3D records
Innovation

Methods, ideas, or system contributions that make the work stand out.

Geometry-driven neural rendering for viewpoint synthesis
Diffusion-based enhancement for visual fidelity
Reconstructing egocentric views from fixed cameras
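The innovation bullets above describe a three-stage pipeline: multi-view geometric modeling of the OR from fixed cameras, geometry-driven neural rendering of a chosen (e.g., egocentric) viewpoint, and diffusion-based enhancement of the synthesized view. As a rough mental model only, the data flow might be sketched as below. All class and function names are hypothetical illustrations; EgoSurg's actual implementation and API are not published in this listing, and the bodies here are toy stand-ins for the real geometric, neural-rendering, and diffusion components.

```python
from dataclasses import dataclass
from typing import List

# Illustrative sketch only: these names and data structures are assumptions,
# not the authors' published code.

@dataclass
class Frame:
    camera_id: str   # which wall-mounted OR camera captured this frame
    timestamp: float # capture time in seconds
    pixels: List[float]  # flattened toy image placeholder

def estimate_geometry(frames: List[Frame]) -> dict:
    """Stage 1 (stand-in): multi-view geometric modeling.
    Here we merely group camera IDs by timestamp; a real system would
    triangulate a dynamic 3D scene from the synchronized camera views."""
    scene: dict = {}
    for f in frames:
        scene.setdefault(f.timestamp, []).append(f.camera_id)
    return scene

def render_novel_view(scene: dict, timestamp: float, pose: str) -> str:
    """Stage 2 (stand-in): geometry-driven rendering of an arbitrary
    viewpoint (e.g., a clinician's head pose) at a chosen moment."""
    cams = scene.get(timestamp, [])
    return f"render[{pose}@{timestamp}] from {len(cams)} cameras"

def diffusion_enhance(view: str) -> str:
    """Stage 3 (stand-in): diffusion-based view enhancement; a real
    system would iteratively denoise the rendered image."""
    return view + " +enhanced"

def egocentric_replay(frames: List[Frame], timestamps: List[float], pose: str) -> List[str]:
    """Chain the three stages into a per-timestamp egocentric replay."""
    scene = estimate_geometry(frames)
    return [diffusion_enhance(render_novel_view(scene, t, pose)) for t in timestamps]

frames = [Frame("cam1", 0.0, []), Frame("cam2", 0.0, []), Frame("cam1", 1.0, [])]
replay = egocentric_replay(frames, [0.0, 1.0], "surgeon_head")
print(replay[0])  # render[surgeon_head@0.0] from 2 cameras +enhanced
```

The point of the sketch is the staging, not the stubs: geometry is estimated once per scene, while rendering and enhancement run per requested viewpoint and timestamp, which matches the paper's claim of arbitrary-viewpoint replay at any moment.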
Han Zhang
Johns Hopkins University, Baltimore, 21218, MD, USA
Lalithkumar Seenivasan
Johns Hopkins University | National University of Singapore (PhD)
Healthcare Automation, Medical AI, Medical Robotics, Surgical Data Science
Jose L. Porras
Johns Hopkins University, Baltimore, 21218, MD, USA; Johns Hopkins Medical Institutions, Baltimore, 21218, MD, USA
Roger D. Soberanis-Mukul
Researcher, Advanced Robotics and Computationally Augmented Environments Lab, Johns Hopkins
Deep learning for medical applications, medical image segmentation, medical image classification
Hao Ding
Johns Hopkins University, Baltimore, 21218, MD, USA
Hongchao Shu
Johns Hopkins University
Digital Twins in Medicine, Computer Vision, Augmented Reality
Benjamin D. Killeen
Postdoc, Technical University of Munich
Surgical Data Science, Medical AI, Robotics, Simulation
Ankita Ghosh
Johns Hopkins University, Baltimore, 21218, MD, USA
Lonny Yarmus
Johns Hopkins Medical Institutions, Baltimore, 21218, MD, USA
Masaru Ishii
Johns Hopkins Medical Institutions, Baltimore, 21218, MD, USA
Angela Christine Argento
Johns Hopkins Medical Institutions, Baltimore, 21218, MD, USA
Mathias Unberath
Johns Hopkins University
Medical Robotics, Computer Vision, AI/ML, Extended Reality, HCI