VisionNVS: Self-Supervised Inpainting for Novel View Synthesis under the Virtual-Shift Paradigm

📅 2026-03-18
🤖 AI Summary
This work addresses the challenge of novel view synthesis in autonomous driving, where trajectories beyond the original camera paths lack ground-truth supervision. To tackle this, the authors propose a self-supervised image inpainting framework that reformulates view extrapolation as an interpolation-based inpainting task via a virtual displacement strategy, enabling pixel-level supervision from the original images. The method explicitly models photometric discrepancies and calibration errors across multiple cameras through pseudo-3D seam blending. By integrating monocular depth priors, self-supervised inpainting, and multi-view consistency constraints—all without requiring LiDAR supervision—the approach significantly enhances geometric fidelity and visual quality, enabling scalable, high-fidelity driving simulation.

📝 Abstract
A fundamental bottleneck in Novel View Synthesis (NVS) for autonomous driving is the inherent supervision gap on novel trajectories: models are tasked with synthesizing unseen views during inference, yet lack ground truth images for these shifted poses during training. In this paper, we propose VisionNVS, a camera-only framework that fundamentally reformulates view synthesis from an ill-posed extrapolation problem into a self-supervised inpainting task. By introducing a "Virtual-Shift" strategy, we use monocular depth proxies to simulate occlusion patterns and map them onto the original view. This paradigm shift allows the use of raw, recorded images as pixel-perfect supervision, effectively eliminating the domain gap inherent in previous approaches. Furthermore, we address spatial consistency through a Pseudo-3D Seam Synthesis strategy, which integrates visual data from adjacent cameras during training to explicitly model real-world photometric discrepancies and calibration errors. Experiments demonstrate that VisionNVS achieves superior geometric fidelity and visual quality compared to LiDAR-dependent baselines, offering a robust solution for scalable driving simulation.
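The core of the Virtual-Shift idea can be illustrated with a minimal depth-based forward-warping sketch. This is an assumption-laden toy, not the paper's implementation: the `baseline` and `focal` parameters are hypothetical stand-ins for the virtual camera displacement, and the paper's depth comes from a monocular estimator rather than being given. The key point it demonstrates is that warping to a shifted virtual camera leaves dis-occlusion holes, and those holes form the inpainting mask for which the original recorded image supplies pixel-level supervision.

```python
import numpy as np

def virtual_shift(image, depth, baseline, focal):
    """Forward-warp `image` to a horizontally shifted virtual camera.

    Each pixel shifts by disparity = focal * baseline / depth, so nearby
    surfaces move more than the background. Target pixels that no source
    pixel lands on are dis-occlusions: the regions an inpainting network
    must fill, supervised by the original (un-shifted) image.
    """
    h, w = depth.shape
    warped = np.zeros_like(image)
    zbuf = np.full((h, w), np.inf)        # keep the nearest surface per target pixel
    disparity = focal * baseline / depth  # large for near (small-depth) pixels
    for y in range(h):
        for x in range(w):
            xt = int(x + disparity[y, x] + 0.5)  # nearest target column
            if 0 <= xt < w and depth[y, x] < zbuf[y, xt]:
                zbuf[y, xt] = depth[y, x]
                warped[y, xt] = image[y, x]
    hole_mask = np.isinf(zbuf)  # dis-occluded pixels to inpaint
    return warped, hole_mask

# Toy scene: flat background (depth 2) with one near pixel (depth 1).
image = np.arange(8, dtype=float).reshape(1, 8)
depth = np.full((1, 8), 2.0)
depth[0, 3] = 1.0
warped, mask = virtual_shift(image, depth, baseline=1.0, focal=2.0)
# The near pixel shifts twice as far as the background, exposing a hole
# behind it (and at the image border) that inpainting must fill.
```

A production variant would warp with sub-pixel splatting and both axes of displacement, but even this 1-D version shows why the reformulation works: the mask is generated synthetically, while the supervision signal stays entirely in the real recorded frame.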
Problem

Research questions and friction points this paper is trying to address.

Novel View Synthesis
supervision gap
autonomous driving
view extrapolation
occlusion
Innovation

Methods, ideas, or system contributions that make the work stand out.

Virtual-Shift
Self-Supervised Inpainting
Novel View Synthesis
Pseudo-3D Seam Synthesis
Camera-only NVS
👥 Authors
Hongbo Lu (Shanghai Jiao Tong University and COW ARobot Co. Ltd.)
Liang Yao (Hohai University)
Chenghao He (COW ARobot Co. Ltd.)
Fan Liu (Hohai University)
Wenlong Liao (COWAROBOT)
Tao He (GRG Banking Equipment Co., Ltd.)
Pai Peng (COW ARobot Co. Ltd.)