TTSA3R: Training-Free Temporal-Spatial Adaptive Persistent State for Streaming 3D Reconstruction

📅 2026-01-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of catastrophic forgetting in streaming 3D reconstruction over long sequences, which arises from an imbalance between historical states and new observations. To mitigate this issue, the paper introduces the first training-free, spatiotemporally adaptive state update framework that jointly models temporal evolution patterns and spatial context consistency. The approach employs a temporal adaptation module to analyze state dynamics and a spatial context module to assess observation quality, fusing dual signals—state-observation alignment and scene dynamics—to guide robust state updates. Evaluated on extended sequences, the method incurs only a 15% increase in reconstruction error, markedly outperforming baseline approaches that suffer over 200% performance degradation, thereby substantially enhancing reconstruction stability.

📝 Abstract
Streaming recurrent models enable efficient 3D reconstruction by maintaining persistent state representations. However, they suffer from catastrophic memory forgetting over long sequences due to the difficulty of balancing historical information with new observations. Recent methods alleviate this by deriving adaptive signals from an attention perspective, but they operate on a single dimension without considering temporal and spatial consistency. To this end, we propose a training-free framework termed TTSA3R that leverages both temporal state evolution and spatial observation quality for adaptive state updates in 3D reconstruction. In particular, we devise a Temporal Adaptive Update Module that regulates update magnitude by analyzing temporal state evolution patterns. Then, a Spatial Contextual Update Module is introduced to localize spatial regions that require updates through observation-state alignment and scene dynamics. These complementary signals are finally fused to determine the state update strategy. Extensive experiments demonstrate the effectiveness of TTSA3R on diverse 3D tasks. Moreover, our method exhibits only a 15% error increase, compared to over 200% degradation in baseline models, on extended sequences, significantly improving long-term reconstruction stability. Our code will be released soon.
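The abstract describes a gated update in which a temporal signal (state evolution) sets the update magnitude and a spatial signal (observation-state alignment) localizes where updates land. Since the paper's code is not yet released, the sketch below is purely illustrative: all names, formulas, and thresholds are assumptions, intended only to convey the general shape of such a fused, training-free state update.

```python
import numpy as np

def adaptive_state_update(state, observation, prev_state, eps=1e-8):
    """Illustrative gated state update fusing a temporal and a spatial signal.

    Hypothetical sketch -- not the paper's actual modules. It only shows
    the idea of modulating update magnitude globally (temporal) while
    localizing updated regions per element (spatial).
    """
    # Temporal signal: relative drift of the persistent state.
    # A stable state (low drift) suggests smaller, conservative updates.
    drift = np.linalg.norm(state - prev_state) / (np.linalg.norm(state) + eps)
    alpha = np.clip(drift, 0.05, 0.95)  # global update rate in (0, 1)

    # Spatial signal: per-element observation-state misalignment.
    # Poorly aligned regions receive proportionally larger updates.
    misalignment = np.abs(observation - state)
    mask = misalignment / (misalignment.max() + eps)  # normalized to [0, 1]

    # Fused gate: elementwise convex combination of state and observation.
    gate = alpha * mask
    return (1.0 - gate) * state + gate * observation
```

Because the gate stays in (0, 1), each element of the updated state remains a convex blend of the old state and the new observation, which is one simple way to avoid both catastrophic overwriting and stale-state drift.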
Problem

Research questions and friction points this paper is trying to address.

streaming 3D reconstruction
catastrophic forgetting
temporal consistency
spatial consistency
persistent state
Innovation

Methods, ideas, or system contributions that make the work stand out.

training-free
temporal-spatial adaptation
persistent state
streaming 3D reconstruction
catastrophic forgetting
Zhijie Zheng
University of California, Davis, CA 95616, USA
Xinhao Xiang
University of California, Davis, CA 95616, USA
Jiawei Zhang
University of California, Davis (UC Davis)
Machine Learning · Foundation Model