tttLRM: Test-Time Training for Long Context and Autoregressive 3D Reconstruction

📅 2026-02-23

🤖 AI Summary

This work targets efficient, high-quality 3D reconstruction in long-context and autoregressive settings. It proposes tttLRM, a large 3D reconstruction model built on test-time training (TTT) layers. The TTT layers compress a stream of multi-view images into their fast weights, which act as an implicit 3D latent representation. The online-learning variant refines this representation progressively as new views arrive, and the result can be decoded into explicit formats such as 3D Gaussian splats. Computational complexity stays linear, which supports long input sequences and autoregressive reconstruction. The authors report that pretraining on novel view synthesis transfers to explicit 3D modeling, giving better reconstruction quality and faster convergence, and that tttLRM outperforms state-of-the-art feedforward 3D Gaussian reconstruction methods on both objects and scenes.

📝 Abstract

We propose tttLRM, a novel large 3D reconstruction model that leverages a Test-Time Training (TTT) layer to enable long-context, autoregressive 3D reconstruction with linear computational complexity, further scaling the model's capability. Our framework efficiently compresses multiple image observations into the fast weights of the TTT layer, forming an implicit 3D representation in the latent space that can be decoded into various explicit formats, such as Gaussian Splats (GS) for downstream applications. The online learning variant of our model supports progressive 3D reconstruction and refinement from streaming observations. We demonstrate that pretraining on novel view synthesis tasks effectively transfers to explicit 3D modeling, resulting in improved reconstruction quality and faster convergence. Extensive experiments show that our method achieves superior performance in feedforward 3D Gaussian reconstruction compared to state-of-the-art approaches on both objects and scenes.
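
The idea of compressing streamed observations into fast weights can be illustrated with a toy example. The sketch below is not the paper's architecture: it assumes a single linear fast-weight matrix updated by one gradient step per token on a squared reconstruction loss, and every name in it (`FastWeightMemory`, `update`, `query`, `run_demo`) is hypothetical. A hidden linear map stands in for the scene, each "view" is a batch of key/value tokens, and decoding to Gaussian splats is not modeled at all.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def unit_vector(rng, dim):
    """Draw a random direction and scale it to unit length."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


class FastWeightMemory:
    """Toy linear fast-weight memory (hypothetical, not the paper's layer)."""

    def __init__(self, dim_k, dim_v, lr=1.0):
        self.W = [[0.0] * dim_k for _ in range(dim_v)]  # fixed-size state
        self.lr = lr  # assumed step size; keys are unit length in this demo

    def update(self, keys, values):
        """One gradient step per token on the loss 0.5 * ||W k - v||^2."""
        for k, v in zip(keys, values):
            err = [p - t for p, t in zip(matvec(self.W, k), v)]
            for row, e in zip(self.W, err):
                for j, kj in enumerate(k):
                    row[j] -= self.lr * e * kj  # gradient is (W k - v) k^T

    def query(self, q):
        """Read out a prediction for a query token."""
        return matvec(self.W, q)


def run_demo(num_views=8, tokens_per_view=16, dim_k=4, dim_v=3, seed=0):
    rng = random.Random(seed)
    # Hidden linear map standing in for the scene the views observe.
    A = [[rng.uniform(-1.0, 1.0) for _ in range(dim_k)] for _ in range(dim_v)]
    held_out = [unit_vector(rng, dim_k) for _ in range(32)]
    mem = FastWeightMemory(dim_k, dim_v)
    errors = []
    for _ in range(num_views):
        keys = [unit_vector(rng, dim_k) for _ in range(tokens_per_view)]
        values = [matvec(A, k) for k in keys]
        mem.update(keys, values)  # each token is visited once
        # Mean squared error on held-out queries after this view.
        sq = 0.0
        for q in held_out:
            pred, target = mem.query(q), matvec(A, q)
            sq += sum((p - t) ** 2 for p, t in zip(pred, target))
        errors.append(sq / len(held_out))
    return errors, mem


errors, mem = run_demo()
print(["%.2e" % e for e in errors])
```

Two properties of this toy mirror the claims in the abstract. The state `W` keeps the same size no matter how many views are streamed, and each token is processed once, so the update cost grows linearly with the number of tokens. The held-out error recorded after each view shows the memory being refined progressively as observations arrive.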

Problem

Research questions and friction points this paper is trying to address.

3D reconstruction

long context

autoregressive modeling

test-time training

streaming observations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Test-Time Training

Autoregressive 3D Reconstruction

Linear Complexity