🤖 AI Summary
To address challenges in mobile panoramic image stitching and novel-view synthesis -- depth parallax, view-dependent illumination, local motion artifacts, and color inconsistencies -- this paper proposes the Spherical Neural Light Field (SNLF). Methodologically, it introduces a single-layer neural light sphere architecture that implicitly encodes scenes in spherical coordinates, jointly optimizing the camera trajectory and a high-resolution scene reconstruction at test time. The model further decomposes the scene into view-dependent ray offset and color components. It is compact (80 MB per scene), avoids volumetric sampling, and renders at 1080p in real time (50 FPS). Compared to conventional stitching pipelines and NeRF-based approaches, SNLF achieves superior reconstruction fidelity and significantly improved robustness to camera shake and dynamic objects, and it supports implicit panoramic stitching and wide field-of-view re-rendering from arbitrary-path video captures.
📝 Abstract
Challenging to capture and challenging to display on a cellphone screen, the panorama paradoxically remains both a staple and an underused feature of modern mobile camera applications. In this work we address both of these challenges with a spherical neural light field model for implicit panoramic image stitching and re-rendering, able to accommodate depth parallax, view-dependent lighting, and local scene motion and color changes during capture. Fit at test time to an arbitrary-path panoramic video capture -- vertical, horizontal, random-walk -- these neural light spheres jointly estimate the camera path and a high-resolution scene reconstruction to produce novel wide field-of-view projections of the environment. Our single-layer model avoids expensive volumetric sampling and decomposes the scene into compact view-dependent ray offset and color components, with a total model size of 80 MB per scene and real-time (50 FPS) rendering at 1080p resolution. We demonstrate improved reconstruction quality over traditional image stitching and radiance field methods, with significantly higher tolerance to scene motion and non-ideal capture settings.
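To make the rendering model concrete: a light sphere replaces per-ray volumetric integration with a single lookup, mapping each camera ray to its intersection with a sphere in spherical coordinates, applying a view-dependent angular offset (to absorb parallax), and then querying a view-dependent color. The sketch below illustrates this geometry with placeholder functions standing in for the learned components; the function names (`offset_fn`, `color_fn`) and the unit-sphere radius are illustrative assumptions, not the paper's actual parameterization.

```python
import numpy as np

def ray_to_sphere_coords(origin, direction, radius=1.0):
    """Intersect a ray with a sphere centered at the world origin and
    return the hit point in spherical coordinates (theta, phi).
    Assumes the ray origin lies inside the sphere, so the positive
    root of the quadratic always exists."""
    d = direction / np.linalg.norm(direction)
    # Solve |origin + t*d|^2 = radius^2 for the positive root t.
    b = np.dot(origin, d)
    c = np.dot(origin, origin) - radius**2
    t = -b + np.sqrt(b * b - c)
    p = origin + t * d
    theta = np.arccos(np.clip(p[2] / radius, -1.0, 1.0))  # polar angle
    phi = np.arctan2(p[1], p[0])                          # azimuth
    return theta, phi

def render_ray(origin, direction, offset_fn, color_fn):
    """One light-field lookup per ray: warp the sphere coordinates with a
    view-dependent offset, then query a view-dependent color model --
    no sampling of points along the ray."""
    theta, phi = ray_to_sphere_coords(origin, direction)
    d_theta, d_phi = offset_fn(theta, phi, direction)  # parallax correction
    return color_fn(theta + d_theta, phi + d_phi, direction)

# Placeholder "networks": zero offset and a smooth test-pattern color.
offset_fn = lambda th, ph, d: (0.0, 0.0)
color_fn = lambda th, ph, d: np.array([np.sin(th), np.cos(ph), 0.5]) * 0.5 + 0.5

rgb = render_ray(np.zeros(3), np.array([0.0, 0.0, 1.0]), offset_fn, color_fn)
```

Because each pixel costs one intersection and one model query rather than dozens of volume samples, this formulation is what makes the reported real-time 1080p rendering plausible.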