🤖 AI Summary
This work addresses two problems in continuous-time optical flow estimation with event cameras: the scarcity of temporally dense ground-truth annotations, and the neglect of temporal continuity and structural coherence in existing contrast maximization (CM) approaches. The authors propose a hybrid-supervised framework based on Spatio-temporal Structural Consistency (STSC), which jointly enforces local structural stability and trajectory continuity. A bidirectionally complementary multi-scale architecture, combined with a curriculum-guided training strategy, enables a smooth transition from supervised point constraints to self-supervised manifold regularization. The authors report state-of-the-art performance on multiple benchmarks for both continuous-time and standard optical flow estimation, with more physically coherent trajectories under complex motion.
📝 Abstract
Estimating continuous optical flow is a fundamental yet challenging problem in dynamic visual perception. Event-based cameras, with microsecond latency and high dynamic range, capture brightness changes asynchronously, offering a unique opportunity to model motion with fine temporal precision. However, the scarcity of temporally dense ground-truth annotations limits the effectiveness of supervised learning, while contrast maximization (CM) frameworks, focused on sharpening the Image of Warped Events (IWE), often neglect temporal continuity and structural coherence, leading to distorted trajectories under complex motion. To overcome these challenges, we propose a hybrid-supervised framework for continuous-time optical flow estimation, grounded in the principle of Spatio-temporal Structural Consistency (STSC). This paradigm jointly enforces local structural stability and trajectory continuity, ensuring physically coherent motion across time. To further enhance representation and robustness, we design a bidirectionally complementary multi-scale architecture and employ a curriculum-guided hybrid training strategy, enabling a smooth transition from supervised point constraints to self-supervised manifold regularization. Comprehensive experiments across multiple benchmarks show that our method achieves state-of-the-art performance in both continuous-time and standard optical flow estimation, demonstrating the effectiveness of the proposed learning paradigm.
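For readers unfamiliar with the contrast maximization baseline that the abstract contrasts against, the following is a minimal sketch of the generic CM objective, not the authors' method. It assumes a single constant flow for all events, nearest-pixel accumulation, and variance as the contrast measure; the function name `iwe_variance` and the synthetic moving-edge data are hypothetical and chosen only for illustration.

```python
import numpy as np

def iwe_variance(x, y, t, flow, t_ref, shape):
    """Warp events to t_ref under one constant flow (vx, vy) in px/s.

    Returns the Image of Warped Events (IWE) and its variance, which
    generic contrast maximization uses as the sharpness objective.
    """
    vx, vy = flow
    # Transport each event back along the candidate motion to the reference time.
    xw = x - (t - t_ref) * vx
    yw = y - (t - t_ref) * vy
    xi = np.rint(xw).astype(int)
    yi = np.rint(yw).astype(int)
    h, w = shape
    inside = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    iwe = np.zeros(shape, dtype=np.float64)
    # np.add.at accumulates correctly when several events hit the same pixel.
    np.add.at(iwe, (yi[inside], xi[inside]), 1.0)
    return iwe, iwe.var()

# Synthetic events (illustrative only): a vertical edge moving right at 20 px/s.
ts = np.linspace(0.0, 0.5, 11)
rows = np.arange(10)
t = np.repeat(ts, rows.size)
y = np.tile(rows, ts.size).astype(float)
x = 5.0 + 20.0 * t

shape = (10, 32)
iwe_true, var_true = iwe_variance(x, y, t, (20.0, 0.0), 0.0, shape)
iwe_zero, var_zero = iwe_variance(x, y, t, (0.0, 0.0), 0.0, shape)
print("variance with correct flow:", var_true)
print("variance with zero flow:   ", var_zero)
```

With the correct flow, all events collapse onto a single column and the IWE variance is higher than with zero flow. Note that this objective only scores sharpness at the reference time and places no constraint on the shape of the trajectory between timestamps, which is the gap the abstract points to.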