VIN-NBV: A View Introspection Network for Next-Best-View Selection for Resource-Efficient 3D Reconstruction

📅 2025-05-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
In complex geometric scenes with self-occlusion, conventional next-best-view (NBV) selection—relying on scene priors or coverage-oriented heuristics—fails to effectively improve 3D reconstruction quality. Method: This paper proposes a resource-constrained, 3D-aware view introspection method that abandons the traditional coverage-maximization paradigm. It introduces an end-to-end NBV decision framework based on single-view quality-gain prediction: 3D-aware feature encoding and query-view embedding model geometric consistency; an improvement-score decoding mechanism and sequential greedy sampling drive view selection; and quality improvement is regressed directly via imitation learning, without requiring scene priors or backtracking. Contribution/Results: The method enables single-step, online NBV selection under motion-time or acquisition-count constraints. Empirical evaluation shows a ~30% improvement in reconstruction quality over coverage-maximization baselines under equivalent resource budgets.

📝 Abstract
Next Best View (NBV) algorithms aim to acquire an optimal set of images using minimal resources, time, or number of captures to enable efficient 3D reconstruction of a scene. Existing approaches often rely on prior scene knowledge or additional image captures, and typically develop policies that maximize coverage. Yet for many real scenes with complex geometry and self-occlusions, coverage maximization does not directly lead to better reconstruction quality. In this paper, we propose the View Introspection Network (VIN), which is trained to predict the reconstruction quality improvement of views directly, and the VIN-NBV policy: a greedy sequential sampling-based policy where, at each acquisition step, we sample multiple query views and choose the one with the highest VIN-predicted improvement score. We design the VIN to perform 3D-aware featurization of the reconstruction built from prior acquisitions, and for each query view to create a feature that can be decoded into an improvement score. We then train the VIN using imitation learning to predict the reconstruction improvement score. We show that VIN-NBV improves reconstruction quality by ~30% over a coverage-maximization baseline when operating under constraints on the number of acquisitions or the time in motion.
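The greedy sequential sampling policy from the abstract can be sketched as below. This is a minimal illustration, not the authors' implementation: `sample_views`, `vin_score`, and `update` are hypothetical stand-ins for the paper's view sampler, the VIN's scoring forward pass, and the reconstruction update, respectively.

```python
def vin_nbv_select(reconstruction, sample_views, vin_score, update, budget):
    """Greedy sequential NBV selection: at each acquisition step, sample
    candidate query views, score each with the VIN, and acquire the one
    with the highest predicted improvement score."""
    acquired = []
    for _ in range(budget):
        candidates = sample_views(reconstruction)
        # Single-step, online choice: no backtracking over past acquisitions.
        best = max(candidates, key=lambda v: vin_score(reconstruction, v))
        acquired.append(best)
        reconstruction = update(reconstruction, best)
    return acquired
```

The loop commits to each view as soon as it is chosen, which is what makes the policy usable online under acquisition-count or motion-time budgets.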
Problem

Research questions and friction points this paper is trying to address.

Optimizing view selection for efficient 3D reconstruction
Predicting reconstruction quality improvement directly
Reducing resource usage in complex scene capture
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses View Introspection Network for quality prediction
Implements greedy sequential sampling-based policy
Trains with imitation learning for improvement scores
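One way to read the imitation-learning bullet: during training, the regression target for a candidate view is the measured quality gain from actually adding it to the reconstruction. A minimal sketch of such a label follows, with hypothetical `reconstruct` and `error` functions; the paper's exact target formulation may differ.

```python
def improvement_label(acquired_views, candidate, reconstruct, error):
    """Imitation-learning target: relative reconstruction-error reduction
    obtained by adding `candidate` to the already-acquired views."""
    e_before = error(reconstruct(acquired_views))
    e_after = error(reconstruct(acquired_views + [candidate]))
    return (e_before - e_after) / max(e_before, 1e-8)  # normalized quality gain
```

Normalizing by the pre-acquisition error makes labels comparable across scenes at different stages of reconstruction.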