Photoreal Scene Reconstruction from an Egocentric Device

📅 2025-06-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address two limitations of first-person HDR scene reconstruction—insufficient rolling-shutter distortion modeling, and VIO trajectory accuracy that falls short of pixel-level reconstruction requirements—this paper proposes a high-fidelity, physics-aware reconstruction framework. First, it introduces visual-inertial bundle adjustment (VIBA) to calibrate the precise timestamps and motion of the rolling-shutter RGB camera as a high-frequency trajectory, enabling millisecond-scale rolling-shutter compensation. Second, it integrates a physically grounded image formation model into Gaussian Splatting that jointly accounts for multi-exposure HDR capture and the sensor response. Evaluated on both Project Aria and Meta Quest 3 devices across diverse indoor and outdoor scenes under varying illumination, the method achieves a consistent 2 dB PSNR improvement: +1 dB from VIBA and a further +1 dB from the physics-based image formation model. The complete implementation, evaluation datasets, and recording profiles are publicly released.
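The rolling-shutter compensation described above hinges on two ingredients: each scanline of the RGB frame is captured at a slightly different time, and the VIBA-refined trajectory is dense enough to query a device pose at any of those times. A minimal sketch of that idea (the function names, linear interpolation, and uniform row-readout assumption are illustrative, not the paper's implementation):

```python
import numpy as np

def row_timestamps(frame_start_s, num_rows, readout_time_s):
    """Per-scanline capture times for a rolling-shutter sensor.

    Rows are read out sequentially, so row r is captured
    r * (readout_time_s / num_rows) seconds after the frame start.
    Assumes a uniform row readout rate.
    """
    return frame_start_s + np.arange(num_rows) * (readout_time_s / num_rows)

def interpolate_positions(traj_t, traj_pos, query_t):
    """Sample device positions from a high-frequency trajectory.

    traj_t:   (N,) trajectory timestamps in seconds (sorted)
    traj_pos: (N, 3) positions at those timestamps
    query_t:  (M,) per-row capture times to sample at

    Linear interpolation per axis; a real pipeline would also
    interpolate rotations (e.g. SLERP on quaternions).
    """
    return np.stack(
        [np.interp(query_t, traj_t, traj_pos[:, k]) for k in range(3)],
        axis=-1,
    )
```

With these, each scanline can be rendered from its own interpolated pose instead of one global-shutter pose per frame, which is exactly the distortion the paper argues frame-rate 6DoF poses cannot capture.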

📝 Abstract
In this paper, we investigate the challenges of using egocentric devices to photorealistically reconstruct scenes in high dynamic range. Existing methodologies typically assume the frame-rate 6DoF poses estimated by the device's visual-inertial odometry system, which can neglect details crucial for pixel-accurate reconstruction. This study presents two significant findings. First, in contrast to mainstream work that treats the RGB camera as a global-shutter, frame-rate camera, we emphasize the importance of employing visual-inertial bundle adjustment (VIBA) to calibrate the precise timestamps and motion of the rolling-shutter RGB camera in a high-frequency trajectory format, which ensures accurate calibration of the rolling-shutter camera's physical properties. Second, we incorporate a physically based image formation model into Gaussian Splatting, which effectively addresses sensor characteristics including the rolling-shutter effect of RGB cameras and the dynamic range measured by the sensor. Our proposed formulation is applicable to widely used variants of the Gaussian Splats representation. We conduct a comprehensive evaluation of our pipeline using the open-source Project Aria device under diverse indoor and outdoor lighting conditions, and further validate it on a Meta Quest 3 device. Across all experiments, we observe a consistent visual enhancement of +1 dB in PSNR by incorporating VIBA, with an additional +1 dB achieved through our proposed image formation model. Our complete implementation, evaluation datasets, and recording profile are available at http://www.projectaria.com/photoreal-reconstruction/
Problem

Research questions and friction points this paper is trying to address.

Accurately reconstructing photorealistic scenes from egocentric devices
Compensating for rolling-shutter effects in RGB cameras for pixel-precise reconstruction
Incorporating sensor characteristics into Gaussian Splatting to handle high dynamic range
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses visual-inertial bundle adjustment (VIBA) to recover high-frequency rolling-shutter camera trajectories
Incorporates a physically based image formation model
Extends Gaussian Splatting to model sensor characteristics (rolling shutter, dynamic range)
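The physical image formation model contributed here couples scene radiance to what the sensor actually records: exposure scaling, clipping to the sensor's dynamic range, and a nonlinear response curve, which together also enable multi-exposure HDR. A minimal sketch under simplifying assumptions (a pure gamma response and hat-shaped merge weights are stand-ins; the paper's calibrated sensor model is more detailed):

```python
import numpy as np

def simulate_capture(radiance, exposure_s, gain=1.0, gamma=2.2, max_val=1.0):
    """Map linear HDR scene radiance to a sensor measurement:
    scale by exposure time and gain, clip to the sensor's
    dynamic range, then apply a gamma response curve."""
    linear = np.clip(radiance * exposure_s * gain, 0.0, max_val)
    return linear ** (1.0 / gamma)

def hdr_from_exposures(images, exposures, gamma=2.2, eps=1e-6):
    """Naive multi-exposure HDR merge: invert the response curve,
    divide out exposure time, and average with hat weighting so
    clipped/noisy extremes contribute little."""
    est, wsum = 0.0, 0.0
    for img, t in zip(images, exposures):
        lin = img ** gamma                   # undo response curve
        w = 1.0 - np.abs(2.0 * img - 1.0)    # trust mid-range pixels most
        est += w * lin / t
        wsum += w
    return est / np.maximum(wsum, eps)
```

Differentiating such a model end to end is what lets Gaussian Splatting fit raw multi-exposure captures directly rather than tone-mapped frames.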