TwinOR: Photorealistic Digital Twins of Dynamic Operating Rooms for Embodied AI Research

📅 2025-11-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the need for safe, continual learning and evaluation of embodied AI in surgical environments by proposing the first centimeter-accurate, dynamically interactive, high-fidelity digital twin of an operating room (OR). Methodologically, it establishes an end-to-end pipeline integrating: (1) static geometry reconstruction from pre-scan videos; (2) behavioral motion modeling from multi-view video; (3) joint rendering of static and dynamic elements; and (4) stereo and monocular sensor simulation. The core contributions are: (1) the first dynamic OR digital twin achieving both photorealistic visual fidelity and behaviorally consistent human–robot interactions; and (2) synthetically generated data on which foundational models—including FoundationStereo and ORB-SLAM3—achieve geometric understanding and visual localization performance comparable to that attained on real indoor datasets, thereby validating sensor-level realism.
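The four-stage pipeline summarized above can be sketched as a minimal data-flow skeleton. All class and function names below (`StaticScene`, `DynamicTrack`, `fuse`, `render_frame`) are hypothetical illustrations of the stage boundaries, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class StaticScene:
    """Stage 1: static OR geometry reconstructed from pre-scan videos."""
    meshes: list


@dataclass
class DynamicTrack:
    """Stage 2: a time-indexed pose track for a person or piece of
    equipment, estimated from multi-view perception of OR activities."""
    name: str
    poses: list  # one pose per frame (placeholder tuples here)


@dataclass
class TwinScene:
    """Stage 3: fused static + dynamic digital twin."""
    static: StaticScene
    dynamics: list


def fuse(static: StaticScene, tracks: list) -> TwinScene:
    """Combine static geometry with motion tracks for joint rendering."""
    return TwinScene(static=static, dynamics=tracks)


def render_frame(scene: TwinScene, t: int) -> dict:
    """Stage 4 stub: pose every dynamic element at time t, the input
    a sensor simulator would rasterize into image streams."""
    return {trk.name: trk.poses[min(t, len(trk.poses) - 1)]
            for trk in scene.dynamics}


# Minimal usage with toy 2D poses
static = StaticScene(meshes=["walls", "table", "lights"])
tracks = [DynamicTrack("surgeon", poses=[(0.0, 0.0), (0.1, 0.0)]),
          DynamicTrack("c_arm", poses=[(1.0, 2.0)])]
scene = fuse(static, tracks)
frame = render_frame(scene, t=1)
print(frame)  # poses of each dynamic element at t=1
```

The point of the sketch is the separation of concerns: static reconstruction happens once, motion tracks are updated continuously, and rendering queries both at any simulated timestamp.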

📝 Abstract
Developing embodied AI for intelligent surgical systems requires safe, controllable environments for continual learning and evaluation. However, safety regulations and operational constraints in operating rooms (ORs) limit embodied agents from freely perceiving and interacting in realistic settings. Digital twins provide high-fidelity, risk-free environments for exploration and training, but how to create photorealistic, dynamic digital representations of ORs that capture the relevant spatial, visual, and behavioral complexity remains unclear. We introduce TwinOR, a framework for constructing photorealistic, dynamic digital twins of ORs for embodied AI research. The system reconstructs static geometry from pre-scan videos and continuously models human and equipment motion through multi-view perception of OR activities. The static and dynamic components are fused into an immersive 3D environment that supports controllable simulation and embodied exploration. The proposed framework reconstructs complete OR geometry with centimeter-level accuracy while preserving dynamic interactions across surgical workflows, enabling realistic renderings and a virtual playground for embodied AI systems. In our experiments, TwinOR simulates stereo and monocular sensor streams for geometry understanding and visual localization tasks. Models such as FoundationStereo and ORB-SLAM3 achieve performance on TwinOR-synthesized data within their reported accuracy on real indoor datasets, demonstrating that TwinOR provides sensor-level realism sufficient for perception and localization challenges. By establishing a real-to-sim pipeline for constructing dynamic, photorealistic digital twins of OR environments, TwinOR enables the safe, scalable, and data-efficient development and benchmarking of embodied AI, ultimately accelerating the deployment of embodied AI from simulation to the real world.
Problem

Research questions and friction points this paper is trying to address.

Creating photorealistic digital twins of operating rooms for AI training
Capturing spatial, visual, and behavioral complexity in surgical environments
Enabling safe embodied AI development through realistic simulation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reconstructs static geometry from pre-scan videos
Models human and equipment motion via multi-view perception
Fuses static and dynamic components into immersive 3D environment
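The simulated stereo streams that the experiments feed to depth models rest on the standard pinhole stereo relation, disparity = f·B/Z. A minimal sketch of that conversion, with a hypothetical rig (600 px focal length, 6 cm baseline) not taken from the paper:

```python
def depth_to_disparity(depth_m, focal_px, baseline_m):
    """Convert per-pixel depths Z (meters) to stereo disparities d (pixels)
    using the pinhole stereo relation d = f * B / Z."""
    return [focal_px * baseline_m / z for z in depth_m]


# Hypothetical rig: focal length 600 px, baseline 0.06 m
depths = [0.5, 1.0, 2.0, 3.0]  # meters, e.g. sampled from a rendered depth map
disp = depth_to_disparity(depths, focal_px=600.0, baseline_m=0.06)
print(disp)  # [72.0, 36.0, 18.0, 12.0]
```

Because the twin supplies ground-truth depth for every rendered frame, disparities derived this way give exact targets against which a stereo model's predictions can be scored.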
Han Zhang
Johns Hopkins University, Baltimore, MD, USA.
Yiqing Shen
Johns Hopkins
Roger D. Soberanis-Mukul
Researcher, Advanced Robotics and Computationally Augmented Environments Lab, Johns Hopkins
Deep Learning for Medical Applications · Medical Image Segmentation · Medical Image Classification
Ankita Ghosh
Johns Hopkins University, Baltimore, MD, USA.
Hao Ding
Johns Hopkins University, Baltimore, MD, USA.
Lalithkumar Seenivasan
Johns Hopkins University | National University of Singapore (PhD)
Healthcare Automation · Medical AI · Medical Robotics · Surgical Data Science
Jose L. Porras
Johns Hopkins University, Baltimore, MD, USA.; Johns Hopkins Medical Institutions, Baltimore, MD, USA.
Zhekai Mao
Johns Hopkins University, Baltimore, MD, USA.
Chenjia Li
Johns Hopkins University
Wenjie Xiao
Johns Hopkins University, Baltimore, MD, USA.
L. Yarmus
Johns Hopkins Medical Institutions, Baltimore, MD, USA.
A. Argento
Johns Hopkins Medical Institutions, Baltimore, MD, USA.
Masaru Ishii
Johns Hopkins Medical Institutions, Baltimore, MD, USA.
Mathias Unberath
Johns Hopkins University
Medical Robotics · Computer Vision · AI/ML · Extended Reality · HCI