🤖 AI Summary
To address the challenge of training optical flow models under hardware resource constraints, this paper proposes FlowSeek—a lightweight framework that combines a single-image depth foundation model with a classical low-dimensional motion-basis parameterization, enabling joint optimization of feature learning and motion modeling. FlowSeek pairs a compact network architecture with depth-guided feature extraction and the low-dimensional motion basis to significantly reduce computational overhead. Compared to the previous state-of-the-art SEA-RAFT, FlowSeek achieves relative improvements of 10% and 15% in cross-dataset generalization on the Sintel Final and KITTI benchmarks, respectively, and also attains state-of-the-art performance on the Spring and LayeredFlow datasets. Crucially, FlowSeek trains on a single consumer-grade GPU—a hardware budget roughly 8× lower than that of most recent methods—markedly improving accessibility and deployment efficiency in resource-constrained environments.
📝 Abstract
We present FlowSeek, a novel framework for optical flow requiring minimal hardware resources for training. FlowSeek marries the latest advances in the design space of optical flow networks with cutting-edge single-image depth foundation models and classical low-dimensional motion parametrization, implementing a compact yet accurate architecture. FlowSeek is trained on a single consumer-grade GPU, a hardware budget about 8× lower than that of most recent methods, and still achieves superior cross-dataset generalization on Sintel Final and KITTI, with relative improvements of 10% and 15% over the previous state-of-the-art SEA-RAFT, as well as on the Spring and LayeredFlow datasets.
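The "classical low-dimensional motion parametrization" mentioned above refers to representing a dense flow field as a linear combination of a small set of basis fields, so that only a handful of coefficients need to be estimated rather than a motion vector per pixel. The paper does not specify its exact basis here; as a minimal illustrative sketch (not FlowSeek's actual implementation), the classical six-parameter affine motion model can be written this way:

```python
import numpy as np

def affine_flow_basis(h, w):
    """Six classical affine basis fields: each flow component at pixel
    (x, y) is a linear combination of [1, x, y]."""
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    zeros, ones = np.zeros((h, w)), np.ones((h, w))
    # Each basis element is an (h, w, 2) flow field (dx, dy).
    basis = np.stack([
        np.stack([ones,  zeros], -1),  # horizontal translation
        np.stack([xs,    zeros], -1),  # horizontal shear/scale in x
        np.stack([ys,    zeros], -1),  # horizontal shear in y
        np.stack([zeros, ones ], -1),  # vertical translation
        np.stack([zeros, xs   ], -1),  # vertical shear in x
        np.stack([zeros, ys   ], -1),  # vertical shear/scale in y
    ])
    return basis  # shape (6, h, w, 2)

def compose_flow(coeffs, basis):
    """Dense flow as a low-dimensional combination of basis fields."""
    return np.tensordot(coeffs, basis, axes=1)  # shape (h, w, 2)

# Six numbers fully determine a dense 64x64 flow field.
coeffs = np.array([1.0, 0.01, 0.0, -0.5, 0.0, 0.02])
flow = compose_flow(coeffs, affine_flow_basis(64, 64))
```

The appeal of such parametrizations is that the motion of large regions is captured by a few coefficients, which is one route to the compactness the abstract emphasizes.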