🤖 AI Summary
This work addresses the challenge of novel view synthesis in autonomous driving, where trajectories beyond the original camera paths lack ground-truth supervision. To tackle this, the authors propose a self-supervised inpainting framework that reformulates view extrapolation as an inpainting task via a "Virtual-Shift" strategy: monocular depth proxies simulate the occlusion patterns of shifted viewpoints, so the original images can serve as pixel-level supervision. The method also explicitly models photometric discrepancies and calibration errors across multiple cameras through a Pseudo-3D Seam Synthesis strategy. By integrating monocular depth priors, self-supervised inpainting, and multi-view consistency constraints, all without requiring LiDAR supervision, the approach significantly improves geometric fidelity and visual quality, enabling scalable, high-fidelity driving simulation.
📝 Abstract
A fundamental bottleneck in Novel View Synthesis (NVS) for autonomous driving is the inherent supervision gap on novel trajectories: models must synthesize unseen views at inference, yet lack ground-truth images for these shifted poses during training. In this paper, we propose VisionNVS, a camera-only framework that fundamentally reformulates view synthesis from an ill-posed extrapolation problem into a self-supervised inpainting task. By introducing a "Virtual-Shift" strategy, we use monocular depth proxies to simulate the occlusion patterns of shifted viewpoints and map them onto the original view. This paradigm shift allows the raw recorded images to serve as pixel-perfect supervision, effectively eliminating the domain gap inherent in previous approaches. Furthermore, we address spatial consistency through a Pseudo-3D Seam Synthesis strategy, which integrates visual data from adjacent cameras during training to explicitly model real-world photometric discrepancies and calibration errors. Experiments demonstrate that VisionNVS achieves superior geometric fidelity and visual quality compared to LiDAR-dependent baselines, offering a robust solution for scalable driving simulation.
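The core "Virtual-Shift" idea can be illustrated with a toy depth-based forward warp: pixels of the original view are reprojected to a laterally shifted virtual camera, and the pixels that receive no source form the occlusion (hole) mask that an inpainting network would be trained to fill, with the original image as supervision. The function name, the planar-shift disparity model, and the toy depth map below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def virtual_shift_mask(depth, fx, baseline):
    """Forward-warp each pixel to a laterally shifted virtual camera
    (disparity = fx * baseline / depth) and return the hole mask:
    pixels in the shifted view that no source pixel lands on.
    Illustrative sketch only -- not the authors' implementation."""
    h, w = depth.shape
    disparity = fx * baseline / depth            # per-pixel horizontal shift
    hit = np.zeros((h, w), dtype=bool)
    xs = np.arange(w)
    for y in range(h):
        tx = np.round(xs - disparity[y]).astype(int)  # target columns
        valid = (tx >= 0) & (tx < w)
        hit[y, tx[valid]] = True
    return ~hit                                  # holes: regions to inpaint

# Toy scene: a near slab (depth 2) in front of a far background (depth 10).
depth = np.full((4, 8), 10.0)
depth[:, 2:4] = 2.0                              # near object -> large disparity
holes = virtual_shift_mask(depth, fx=8.0, baseline=0.5)
# The near slab shifts left, disoccluding the background it covered,
# so the hole mask appears exactly over columns 2-3.
```

Mapping these hole masks back onto the recorded frames is what turns extrapolation into an interpolation-style inpainting problem with pixel-level ground truth.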