EV-LayerSegNet: Self-supervised Motion Segmentation using Event Cameras

📅 2025-06-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Event-camera-based motion segmentation traditionally relies on costly and error-prone ground-truth annotations. To address this, the paper proposes a self-supervised framework inspired by a layered representation of scene dynamics: under an affine-motion assumption, it learns pixel-wise affine optical flow and motion segmentation masks separately, then uses them to deblur (motion-compensate) the input events. The deblurring quality serves as the self-supervised loss driving end-to-end optimization, eliminating any dependence on ground-truth segmentation labels or optical flow. Evaluated on a simulated dataset containing only affine motion, the CNN achieves up to 71% IoU and an 87% detection rate for moving objects, improving the practicality of unsupervised motion understanding with event cameras.

📝 Abstract
Event cameras are novel bio-inspired sensors that capture motion dynamics with much higher temporal resolution than traditional cameras, since pixels react asynchronously to brightness changes. They are therefore better suited for tasks involving motion such as motion segmentation. However, training event-based networks still represents a difficult challenge, as obtaining ground truth is very expensive, error-prone and limited in frequency. In this article, we introduce EV-LayerSegNet, a self-supervised CNN for event-based motion segmentation. Inspired by a layered representation of the scene dynamics, we show that it is possible to learn affine optical flow and segmentation masks separately, and use them to deblur the input events. The deblurring quality is then measured and used as self-supervised learning loss. We train and test the network on a simulated dataset with only affine motion, achieving IoU and detection rate up to 71% and 87% respectively.
Problem

Research questions and friction points this paper is trying to address.

Self-supervised motion segmentation for event cameras
Overcoming expensive ground truth limitations in event-based networks
Learning affine optical flow and segmentation masks separately
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised CNN for event-based motion segmentation
Learns affine optical flow and segmentation separately
Uses deblurring quality as self-supervised loss
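The core self-supervision idea described above can be illustrated with a contrast-maximisation-style sketch: events are warped to a reference time using the predicted affine flow, accumulated into an image of warped events (IWE), and the sharpness of that image is used as the learning signal. This is a minimal numpy illustration, not the paper's implementation; the affine parametrisation `theta` and the function names are assumptions made here for clarity.

```python
import numpy as np

def warp_events(xs, ys, ts, theta, t_ref=0.0):
    """Warp event coordinates to a reference time with an affine flow model.

    theta = (a, b, c, d, vx, vy): hypothetical parametrisation where the flow
    at pixel (x, y) is [a*x + b*y + vx, c*x + d*y + vy] (an assumption, not
    necessarily the paper's exact formulation).
    """
    a, b, c, d, vx, vy = theta
    dt = ts - t_ref
    x_w = xs - dt * (a * xs + b * ys + vx)
    y_w = ys - dt * (c * xs + d * ys + vy)
    return x_w, y_w

def iwe_contrast(xs, ys, ts, theta, mask_weights, shape):
    """Accumulate mask-weighted warped events into an image of warped events
    (IWE) and return its variance: a sharper (better deblurred) IWE has higher
    contrast, so maximising it rewards correct flow/segmentation."""
    h, w = shape
    x_w, y_w = warp_events(xs, ys, ts, theta)
    xi = np.clip(np.round(x_w).astype(int), 0, w - 1)
    yi = np.clip(np.round(y_w).astype(int), 0, h - 1)
    iwe = np.zeros(shape)
    np.add.at(iwe, (yi, xi), mask_weights)  # soft per-event segmentation weight
    return iwe.var()
```

With events generated by a purely translating edge, the correct affine parameters collapse all events onto the same pixels and yield a higher IWE variance than an incorrect (e.g. zero) warp, which is exactly the gradient signal a network can train against.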
Youssef Farah
Advanced Research and Innovation Center, Khalifa University
Federico Paredes-Vallés
PhD; Senior Research Engineer at Sony
Artificial Intelligence · Neuromorphic Computing · Computer Vision · Aerial Robotics
G. D. Croon
MAVLab, TU Delft
M. Humais
Advanced Research and Innovation Center, Khalifa University
Hussain Sajwani
Advanced Research and Innovation Center, Khalifa University
Yahya H. Zweiri
Advanced Research and Innovation Center, Khalifa University