Advances in Feed-Forward 3D Reconstruction and View Synthesis: A Survey

📅 2025-07-19
🤖 AI Summary
Motivated by the urgent demands of AR/VR and digital twin applications for fast, generalizable, and deployment-friendly 3D reconstruction and novel view synthesis, this paper presents a systematic survey of feed-forward deep learning methods—covering dominant representations including point clouds, 3D Gaussian splatting, and neural radiance fields—and focuses on three key challenges: pose-free input, dynamic scene modeling, and 3D-aware content generation. We propose the first unified taxonomy tailored to the feed-forward paradigm, revealing inherent trade-offs between inference efficiency and cross-scene generalization. By integrating self-supervised learning, differentiable rendering, and multimodal input strategies—and leveraging standardized evaluation protocols and large-scale benchmarks—we comprehensively assess accuracy, latency, and robustness. Our analysis provides principled guidance and empirically grounded technology selection criteria for industrial-grade 3D vision systems.

📝 Abstract
3D reconstruction and view synthesis are foundational problems in computer vision, graphics, and immersive technologies such as augmented reality (AR), virtual reality (VR), and digital twins. Traditional methods rely on computationally intensive iterative optimization in a complex pipeline, limiting their applicability in real-world scenarios. Recent feed-forward approaches, driven by deep learning, have revolutionized this field by enabling fast and generalizable 3D reconstruction and view synthesis. This survey offers a comprehensive review of feed-forward techniques for 3D reconstruction and view synthesis, organized by the underlying representation architecture, including point clouds, 3D Gaussian Splatting (3DGS), and Neural Radiance Fields (NeRF). We examine key tasks such as pose-free reconstruction, dynamic 3D reconstruction, and 3D-aware image and video synthesis, highlighting their applications in digital humans, SLAM, robotics, and beyond. In addition, we review commonly used datasets with detailed statistics, along with evaluation protocols for various downstream tasks. We conclude by discussing open research challenges and promising directions for future work, emphasizing the potential of feed-forward approaches to advance the state of the art in 3D vision.
Problem

Research questions and friction points this paper is trying to address.

Surveying feed-forward 3D reconstruction and view synthesis techniques
Addressing limitations of traditional iterative optimization methods
Exploring applications in AR, VR, and digital humans
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep learning enables fast feed-forward 3D reconstruction
Taxonomy based on underlying representations: point clouds, 3DGS, and NeRF
Pose-free and dynamic 3D reconstruction applications