Wid3R: Wide Field-of-View 3D Reconstruction via Camera Model Conditioning

📅 2026-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes Wid3R, a feed-forward neural network for multi-view 3D reconstruction directly from wide field-of-view images, including fisheye and 360° panoramic imagery, rather than assuming rectified or pinhole inputs as prior methods do. It models wide-angle camera geometries by combining a ray representation with spherical harmonics and a camera model token inside the network, which together enable distortion-aware reconstruction. Wid3R shows strong zero-shot robustness and consistently outperforms prior methods, with improvements of up to +77.33 on Stanford2D3D.

📝 Abstract
We present Wid3R, a feed-forward neural network for visual geometry reconstruction that supports wide field-of-view camera models. Prior methods typically assume that input images are rectified or captured with pinhole cameras, since both their architectures and training datasets are tailored to perspective images only. These assumptions limit their applicability in real-world scenarios that use fisheye or panoramic cameras, where such methods often require careful calibration and undistortion. In contrast, Wid3R is a generalizable multi-view 3D estimation method that can model wide field-of-view camera types. Our approach leverages a ray representation with spherical harmonics and a novel camera model token within the network, enabling distortion-aware 3D reconstruction. Furthermore, Wid3R is the first multi-view foundation model to support feed-forward 3D reconstruction directly from 360° imagery. It demonstrates strong zero-shot robustness and consistently outperforms prior methods, achieving improvements of up to +77.33 on Stanford2D3D.
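
To make the idea of a ray representation with spherical harmonics more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the encoding details, so the function names (`equirect_ray`, `fisheye_ray`, `sh_encode`), the axis convention (x right, y down, z forward), the ideal equidistant fisheye model, and the choice of degree 2 are all assumptions made for illustration. The sketch maps a pixel to a unit ray under two wide field-of-view camera models, then evaluates real spherical harmonics on that ray.

```python
import math

# Real spherical harmonic constants up to degree 2.
C0 = 0.5 * math.sqrt(1.0 / math.pi)
C1 = math.sqrt(3.0 / (4.0 * math.pi))
C2_OFF = 0.5 * math.sqrt(15.0 / math.pi)
C2_Z = 0.25 * math.sqrt(5.0 / math.pi)
C2_XY = 0.25 * math.sqrt(15.0 / math.pi)


def equirect_ray(u, v, width, height):
    """Unit ray for pixel (u, v) of an equirectangular panorama.

    Assumed convention: longitude runs from -pi to pi left to right,
    latitude from +pi/2 (top) to -pi/2 (bottom); x right, y down, z forward.
    """
    lon = ((u + 0.5) / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - (v + 0.5) / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = -math.sin(lat)  # image "up" is negative y in this convention
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


def fisheye_ray(u, v, cx, cy, f):
    """Unit ray for an ideal equidistant fisheye, where r = f * theta.

    Assumption: no extra distortion terms; (cx, cy) is the principal point.
    """
    dx, dy = u - cx, v - cy
    r = math.hypot(dx, dy)
    if r == 0.0:
        return (0.0, 0.0, 1.0)  # the optical axis
    theta = r / f  # angle between the ray and the optical axis
    s = math.sin(theta) / r
    return (dx * s, dy * s, math.cos(theta))


def sh_encode(d):
    """Evaluate the 9 real spherical harmonics of degree <= 2 on unit vector d."""
    x, y, z = d
    return [
        C0,                            # l=0
        C1 * y,                        # l=1, m=-1
        C1 * z,                        # l=1, m=0
        C1 * x,                        # l=1, m=1
        C2_OFF * x * y,                # l=2, m=-2
        C2_OFF * y * z,                # l=2, m=-1
        C2_Z * (3.0 * z * z - 1.0),    # l=2, m=0
        C2_OFF * x * z,                # l=2, m=1
        C2_XY * (x * x - y * y),       # l=2, m=2
    ]


# Usage: encode one panorama pixel and one fisheye pixel the same way.
pano_features = sh_encode(equirect_ray(100, 50, 1024, 512))
fisheye_features = sh_encode(fisheye_ray(400.0, 300.0, 320.0, 240.0, 200.0))
```

Because both camera models reduce to a unit ray before encoding, the downstream features share one format regardless of the lens, which is one plausible reading of why a ray-based input suits a network that handles several camera types.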

Problem

Research questions and friction points this paper is trying to address.

wide field-of-view
3D reconstruction
camera model
fisheye
panoramic

Innovation

Methods, ideas, or system contributions that make the work stand out.

wide field-of-view
camera model conditioning
spherical harmonics
feed-forward 3D reconstruction
multi-view foundation model