Free Geometry: Refining 3D Reconstruction from Longer Versions of Itself

📅 2026-04-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of feed-forward 3D reconstruction models: they cannot adapt to a new scene at test time, so occlusions, specularities, and ambiguous cues often produce geometric errors. The authors propose Free Geometry, a test-time self-evolution framework that needs no ground-truth 3D supervision. It builds on the observation that a model given more views produces more reliable, view-consistent reconstructions. For a test sequence, the method masks a subset of frames, enforces cross-view feature consistency between the full-view and partial-view representations, and applies the resulting self-supervised signal through lightweight LoRA updates. Adaptation takes less than 2 minutes per dataset on a single GPU. Across four benchmark datasets, the approach improves state-of-the-art models such as Depth Anything 3 and VGGT, with average gains of 3.73% in camera pose accuracy and 2.88% in point map prediction.
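
To make the "lightweight LoRA updates" concrete, here is a minimal sketch of the standard LoRA reparameterization, in which a frozen weight W is augmented by a low-rank product scaled by alpha / r. The layer sizes, the rank, the alpha value, and the helper names `matmul` and `lora_effective_weight` are illustrative assumptions; the summary does not say which layers Free Geometry adapts or which rank it uses.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the LoRA rank.

    W is (d_out, d_in) and stays frozen; A is (r, d_in) and B is (d_out, r).
    Only A and B would be trained.
    """
    r = len(a)
    delta = matmul(b, a)
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Hypothetical sizes, chosen only for illustration.
d_out, d_in, rank = 16, 16, 2
rng = random.Random(0)

w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]      # frozen
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]    # trainable
b = [[0.0] * rank for _ in range(d_out)]                                # trainable, zero init

w_eff = lora_effective_weight(w, a, b, alpha=4.0)

trainable_params = rank * (d_in + d_out)
full_params = d_in * d_out
```

Because B starts at zero, the adapted layer initially behaves exactly like the pretrained one, and only `rank * (d_in + d_out)` values are updated instead of `d_in * d_out`, which is plausibly part of why adaptation fits in minutes on a single GPU.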

📝 Abstract

Feed-forward 3D reconstruction models are efficient but rigid: once trained, they perform inference in a zero-shot manner and cannot adapt to the test scene. As a result, visually plausible reconstructions often contain errors, particularly under occlusions, specularities, and ambiguous cues. To address this, we introduce Free Geometry, a framework that enables feed-forward 3D reconstruction models to self-evolve at test time without any 3D ground truth. Our key insight is that, when the model receives more views, it produces more reliable and view-consistent reconstructions. Leveraging this property, given a testing sequence, we mask a subset of frames to construct a self-supervised task. Free Geometry enforces cross-view feature consistency between representations from full and partial observations, while maintaining the pairwise relations implied by the held-out frames. This self-supervision allows for fast recalibration via lightweight LoRA updates, taking less than 2 minutes per dataset on a single GPU. Our approach consistently improves state-of-the-art foundation models, including Depth Anything 3 and VGGT, across 4 benchmark datasets, yielding an average improvement of 3.73% in camera pose accuracy and 2.88% in point map prediction. Code is available at https://github.com/hiteacherIamhumble/Free-Geometry .
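
The self-supervised task described in the abstract can be sketched roughly as follows: run the model on all frames, run it again on a masked subset, then penalize disagreement between the two sets of features. This is a toy illustration under stated assumptions, not the authors' implementation. The abstract does not give the loss formulas, so the cosine-based consistency term, the similarity-matching reading of "pairwise relations", the keep ratio, and every function name below are hypothetical. Per-frame features are stand-in random vectors rather than outputs of Depth Anything 3 or VGGT.

```python
import math
import random


def sample_kept_frames(num_frames, keep_ratio, rng):
    """Pick the frame indices that the partial (masked) pass still sees."""
    num_keep = max(2, round(num_frames * keep_ratio))
    return sorted(rng.sample(range(num_frames), num_keep))


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


def feature_consistency_loss(full_feats, partial_feats, kept):
    """Mean (1 - cosine) between partial-pass and full-pass features per kept frame."""
    return sum(1.0 - cosine(partial_feats[i], full_feats[i]) for i in kept) / len(kept)


def pairwise_relation_loss(full_feats, partial_feats, kept):
    """Mean squared gap between frame-to-frame similarities in the two passes."""
    pairs = [(i, j) for i in kept for j in kept if i < j]
    return sum(
        (cosine(partial_feats[i], partial_feats[j]) - cosine(full_feats[i], full_feats[j])) ** 2
        for i, j in pairs
    ) / len(pairs)


rng = random.Random(0)
num_frames, dim = 8, 5

# Stand-in for features from the full-view pass (treated as the more reliable target).
full_feats = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_frames)]

# Mask half of the frames; the partial pass sees only the kept ones.
kept = sample_kept_frames(num_frames, keep_ratio=0.5, rng=rng)

# Stand-in for features from the partial-view pass: the targets plus noise.
partial_feats = {i: [x + rng.gauss(0, 0.1) for x in full_feats[i]] for i in kept}

loss = feature_consistency_loss(full_feats, partial_feats, kept) + pairwise_relation_loss(
    full_feats, partial_feats, kept
)
```

In an actual test-time loop, a loss of this kind would be backpropagated only into the LoRA parameters, with the full-view pass serving as a fixed target.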

Problem

Research questions and friction points this paper is trying to address.

3D reconstruction

feed-forward models

test-time adaptation

view consistency

self-supervision

Innovation

Methods, ideas, or system contributions that make the work stand out.

Free Geometry

self-supervised 3D reconstruction

test-time adaptation