GeoMotion: Rethinking Motion Segmentation via Latent 4D Geometry

📅 2026-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Motion segmentation in dynamic scenes is often hindered by noisy motion cues and by the cumulative errors and high computational cost of multi-stage pipelines. This work proposes an end-to-end, fully learned approach that combines priors from 4D geometric reconstruction (such as π³) with attention mechanisms to implicitly disentangle object and camera motion. By inferring moving objects directly from latent features, the method needs no explicit correspondence estimation or iterative optimization. The authors report state-of-the-art motion segmentation performance with efficient inference, which they attribute to eliminating complex preprocessing and iterative refinement.

📝 Abstract
Motion segmentation in dynamic scenes is highly challenging, as conventional methods heavily rely on estimating camera poses and point correspondences from inherently noisy motion cues. Existing statistical inference or iterative optimization techniques struggle to mitigate the cumulative errors in multi-stage pipelines, often leading to limited performance or high computational cost. In contrast, we propose a fully learning-based approach that directly infers moving objects from latent feature representations via attention mechanisms, thus enabling end-to-end feed-forward motion segmentation. Our key insight is to bypass explicit correspondence estimation and instead let the model learn to implicitly disentangle object and camera motion. Supported by recent advances in 4D scene geometry reconstruction (e.g., $\pi^3$), the proposed method leverages reliable camera poses and rich spatial-temporal priors, which ensure stable training and robust inference for the model. Extensive experiments demonstrate that by eliminating complex pre-processing and iterative refinement, our approach achieves state-of-the-art motion segmentation performance with high efficiency. The code is available at: https://github.com/zjutcvg/GeoMotion.
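The idea of inferring moving regions directly from latent features via attention can be illustrated with a toy sketch. This is not the authors' architecture: the function names, the identity query/key/value projections, the untrained `head_weights`, and the toy token values are all assumptions made for illustration. It only shows the general pattern of letting every spatio-temporal token attend to every other token and then mapping each attended feature to a moving/static probability, with no explicit correspondence step.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    # Single-head scaled dot-product self-attention.
    # Assumption: identity Q/K/V projections, purely for brevity.
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out

def motion_probabilities(tokens, head_weights, head_bias=0.0):
    # Attend over all spatio-temporal tokens, then apply a linear head
    # followed by a sigmoid to get a per-token "moving" probability.
    attended = self_attention(tokens)
    probs = []
    for feat in attended:
        logit = sum(w * x for w, x in zip(head_weights, feat)) + head_bias
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs

# Hypothetical latent tokens: 2 frames x 2 patches, flattened to 4 tokens of dimension 3.
latent_tokens = [
    [0.1, 0.0, 0.2],
    [0.9, 0.8, 0.1],
    [0.1, 0.1, 0.2],
    [0.8, 0.9, 0.0],
]
probs = motion_probabilities(latent_tokens, head_weights=[1.0, 1.0, -1.0])
mask = [p > 0.5 for p in probs]  # thresholded per-token motion mask
```

In a real system the tokens would come from a pretrained 4D reconstruction backbone and the projections and head would be learned; the sketch only conveys the feed-forward data flow from latent tokens to a mask.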
Problem

Research questions and friction points this paper is trying to address.

motion segmentation
dynamic scenes
camera pose estimation
point correspondences
cumulative errors
Innovation

Methods, ideas, or system contributions that make the work stand out.

motion segmentation
latent 4D geometry
attention mechanism
end-to-end learning
camera pose priors
Authors
Xiankang He
Zhejiang University of Technology
Peile Lin
College of Computer Science and Technology, Zhejiang University of Technology; Zhejiang Key Laboratory of Visual Information Intelligent Processing
Ying Cui
Zhejiang University of Technology
Computer Vision
Dongyan Guo
Zhejiang University of Technology
Chunhua Shen
Zhejiang University
Computer Vision, Machine Learning
Xiaoqin Zhang
College of Computer Science and Technology, Zhejiang University of Technology; Zhejiang Key Laboratory of Visual Information Intelligent Processing