🤖 AI Summary
To address pixel misalignment in large-disparity image stitching caused by scene depth variations and wide camera baselines, this paper proposes a novel framework that integrates geometric priors with generative modeling. Methodologically, it first employs a visual-geometry-grounded transformer for robust camera parameter estimation and dense 3D reconstruction; it then achieves cross-view pixel-level alignment via point cloud reprojection; finally, it refines the initial stitched result using a point-conditioned image diffusion module that preserves 3D geometric consistency. To our knowledge, this is the first work to jointly leverage explicit 3D reconstruction and conditional diffusion models for large-disparity stitching. Extensive experiments on multiple real-world complex scenes demonstrate substantial improvements over state-of-the-art methods. Both quantitative metrics and qualitative visual results confirm its high accuracy, strong robustness against depth discontinuities and viewpoint changes, and practical applicability in challenging outdoor and indoor environments.
📝 Abstract
Image stitching aims to align two images taken from different viewpoints into one seamless, wider image. However, when the 3D scene contains depth variations and the camera baseline is significant, noticeable parallax occurs, meaning the relative positions of scene elements differ substantially between views. Most existing stitching methods struggle to handle such images with large parallax effectively. To address this challenge, in this paper, we propose an image stitching solution called PIS3R that is robust to very large parallax, based on the novel concept of deep 3D reconstruction. First, we apply a visual geometry grounded transformer to two input images with very large parallax to obtain both intrinsic and extrinsic camera parameters, as well as a dense 3D reconstruction of the scene. Subsequently, we reproject the reconstructed dense point cloud onto a designated reference view using the recovered camera parameters, achieving pixel-wise alignment and generating an initial stitched image. Finally, to further address potential artifacts such as holes or noise in the initial stitching, we propose a point-conditioned image diffusion module to obtain the refined result. Compared with existing methods, our solution is tolerant to very large parallax and also provides results that fully preserve the geometric integrity of all pixels in the 3D photogrammetric context, enabling direct applicability to downstream 3D vision tasks such as SfM. Experimental results demonstrate that the proposed algorithm provides accurate stitching results for images with very large parallax, and outperforms existing methods both qualitatively and quantitatively.
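The reprojection step described in the abstract — mapping the reconstructed dense point cloud into a designated reference view using the recovered intrinsics and extrinsics — follows the standard pinhole projection model. A minimal sketch of that geometry is shown below; the function name, array shapes, and visibility handling are illustrative assumptions, not the paper's actual implementation (which would additionally splat point colors and handle occlusion to form the stitched image).

```python
import numpy as np

def reproject_points(points_w, K, R, t, height, width):
    """Project world-space 3D points into a reference camera view.

    points_w : (N, 3) world coordinates of the reconstructed point cloud
    K        : (3, 3) recovered intrinsic matrix of the reference camera
    R, t     : recovered extrinsics (world -> camera rotation and translation)

    Returns pixel coordinates (u, v) and camera-space depth z for points
    that land in front of the camera and inside the image bounds.
    (Hypothetical helper for illustration only.)
    """
    # Transform world points into the reference camera frame: X_c = R X_w + t.
    pts_cam = points_w @ R.T + t                      # (N, 3)
    z = pts_cam[:, 2]
    in_front = z > 1e-6                               # keep points in front of the camera

    # Perspective divide, then apply the pinhole intrinsics K.
    uv = (pts_cam[in_front] / z[in_front, None]) @ K.T
    u, v = uv[:, 0], uv[:, 1]

    # Keep only projections that fall inside the reference image.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    return u[inside], v[inside], z[in_front][inside]
```

In a full pipeline, the returned depths would drive a z-buffer so that only the nearest point colors each pixel, and the remaining holes are exactly what the point-conditioned diffusion module is introduced to fill.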