🤖 AI Summary
This work proposes a real-time motion-aware event suppression framework that filters redundant events in event cameras triggered by independently moving objects (IMOs) and ego-motion. The method jointly performs semantic segmentation of the event stream and predicts the future motion trajectories of IMOs, enabling, for the first time, proactive suppression of dynamic events before they occur. Leveraging a lightweight network architecture combined with Vision Transformer token pruning, the approach achieves significant latency reduction while maintaining high accuracy. On the EVIMO benchmark, it improves segmentation accuracy by 67% over prior state-of-the-art methods while running at 173 Hz, a 53% higher inference rate. Furthermore, the proposed ViT acceleration yields an 83% speedup, and the suppressed event streams reduce the absolute trajectory error (ATE) of event-based visual odometry by 13%.
📝 Abstract
In this work, we introduce the first framework for Motion-aware Event Suppression, which learns to filter events triggered by IMOs and ego-motion in real time. Our model jointly segments IMOs in the current event stream while predicting their future motion, enabling anticipatory suppression of dynamic events before they occur. Our lightweight architecture achieves 173 Hz inference on consumer-grade GPUs with less than 1 GB of memory usage, outperforming previous state-of-the-art methods on the challenging EVIMO benchmark by 67% in segmentation accuracy while operating at a 53% higher inference rate. Moreover, we demonstrate significant benefits for downstream applications: our method accelerates Vision Transformer inference by 83% via token pruning and improves event-based visual odometry accuracy, reducing Absolute Trajectory Error (ATE) by 13%.
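The abstract states only that ViT inference is accelerated via token pruning, without specifying the criterion. A common recipe is to score each patch token (e.g. by attention received from the class token) and keep only the top-scoring fraction; a minimal sketch of that generic idea, with all names illustrative rather than taken from the paper, might look like:

```python
def prune_tokens(tokens, scores, keep_ratio=0.5):
    """Keep the top `keep_ratio` fraction of tokens by score,
    preserving their original (spatial) order.

    This is a generic token-pruning step, not the paper's exact method;
    `scores` could come from class-token attention or, plausibly here,
    from the predicted IMO/dynamic-event masks.
    """
    k = max(1, int(len(tokens) * keep_ratio))
    # Indices of the k highest-scoring tokens
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    keep = sorted(top)  # restore original token order
    return [tokens[i] for i in keep]

# Toy example: 8 patch tokens with hypothetical relevance scores
tokens = list("ABCDEFGH")
scores = [0.1, 0.9, 0.3, 0.8, 0.05, 0.7, 0.2, 0.4]
print(prune_tokens(tokens, scores, keep_ratio=0.5))  # → ['B', 'D', 'F', 'H']
```

Halving the token count roughly quadratically reduces the cost of self-attention, which is consistent with the large speedups token pruning can deliver.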