🤖 AI Summary
Existing implicit deformation fields suffer from spatial incoherence and poor temporal interpolation when modeling dense point trajectories; both neural inductive biases and heuristic explicit methods (e.g., linear blend skinning) struggle to balance interpretability with spatiotemporal consistency. This paper proposes an explicit spline-based trajectory representation: deformation is parameterized by a set of spline knots, while a low-rank time-variant spatial encoding decouples spatial and temporal modeling. The formulation supports analytical computation of velocity and acceleration without imposing rigidity or skinning constraints. The method significantly improves temporal interpolation accuracy under sparse inputs and enhances motion coherence in dynamic scene reconstruction, achieving performance competitive with state-of-the-art methods across multiple metrics while offering better model interpretability and geometric consistency.
📝 Abstract
Trajectory modeling of dense points usually employs implicit deformation fields, represented as neural networks that map canonical spatial positions and time stamps to temporal offsets. However, the inductive biases inherent in neural networks can hinder spatial coherence in ill-posed scenarios. Current methods either focus on enhancing encoding strategies for deformation fields, often producing opaque and less intuitive models, or adopt explicit techniques such as linear blend skinning, which rely on heuristic node initialization. Additionally, the potential of implicit representations for interpolating sparse temporal signals remains under-explored. To address these challenges, we propose a spline-based trajectory representation in which the number of knots explicitly determines the degrees of freedom. This approach enables efficient analytical derivation of velocities and accelerations, preserving spatial coherence while mitigating temporal fluctuations. To model knot characteristics in both the spatial and temporal domains, we introduce a novel low-rank time-variant spatial encoding that replaces conventional coupled spatiotemporal techniques. Our method demonstrates superior temporal interpolation when fitting continuous fields from sparse inputs. Furthermore, it achieves dynamic scene reconstruction quality competitive with state-of-the-art methods while enhancing motion coherence, without relying on linear blend skinning or as-rigid-as-possible constraints.
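To make the "analytical velocities and accelerations" claim concrete, here is a minimal sketch of a spline trajectory whose derivatives come in closed form from the basis functions. This is not the paper's implementation: it assumes a cubic Hermite spline with uniform unit knot spacing and Catmull-Rom (central-difference) tangents, and the function names are illustrative.

```python
import numpy as np

def hermite_basis(t):
    # Cubic Hermite basis and its first/second derivatives w.r.t. local t in [0, 1].
    h = np.array([2*t**3 - 3*t**2 + 1, t**3 - 2*t**2 + t,
                  -2*t**3 + 3*t**2,    t**3 - t**2])
    dh = np.array([6*t**2 - 6*t,  3*t**2 - 4*t + 1,
                   -6*t**2 + 6*t, 3*t**2 - 2*t])
    d2h = np.array([12*t - 6, 6*t - 4, -12*t + 6, 6*t - 2])
    return h, dh, d2h

def eval_trajectory(knots, time):
    """Evaluate position, velocity, acceleration of one point's spline trajectory.

    knots: (K, 3) knot positions at uniform times 0, 1, ..., K-1
           (the number of knots K fixes the degrees of freedom).
    time:  scalar query time in [0, K-1]; off-knot times are interpolated.
    """
    K = len(knots)
    k = min(int(np.floor(time)), K - 2)   # segment index
    t = time - k                          # local parameter in [0, 1]
    # Catmull-Rom tangents via central differences (clamped at the ends).
    m = lambda i: 0.5 * (knots[min(i + 1, K - 1)] - knots[max(i - 1, 0)])
    h, dh, d2h = hermite_basis(t)
    ctrl = np.stack([knots[k], m(k), knots[k + 1], m(k + 1)])  # (4, 3)
    pos = h @ ctrl     # position x(t)
    vel = dh @ ctrl    # analytic velocity dx/dt (no finite differences)
    acc = d2h @ ctrl   # analytic acceleration d^2x/dt^2
    return pos, vel, acc
```

Because velocity and acceleration are exact polynomial derivatives of the basis, they can be penalized directly during optimization to damp temporal fluctuations; an implicit MLP field would instead need finite differences or autograd through the network.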