SIS-Challenge: Event-based Spatio-temporal Instance Segmentation Challenge at the CVPR 2025 Event-based Vision Workshop

📅 2025-08-18
🤖 AI Summary
This paper is an overview of the Spatio-temporal Instance Segmentation (SIS) challenge held in conjunction with the CVPR 2025 Event-based Vision Workshop. Task: predict accurate pixel-level segmentation masks of defined object classes from spatio-temporally aligned event camera and grayscale camera data. Content: the paper describes the task, the dataset, the challenge details and the results, and summarizes the methods used by the top-5 ranking teams. Resources: further material and the participants' code are linked from the challenge results page given in the abstract.

📝 Abstract
We present an overview of the Spatio-temporal Instance Segmentation (SIS) challenge held in conjunction with the CVPR 2025 Event-based Vision Workshop. The task is to predict accurate pixel-level segmentation masks of defined object classes from spatio-temporally aligned event camera and grayscale camera data. We provide an overview of the task, dataset, challenge details and results. Furthermore, we describe the methods used by the top-5 ranking teams in the challenge. More resources and code of the participants' methods are available here: https://github.com/tub-rip/MouseSIS/blob/main/docs/challenge_results.md

Problem

Research questions and friction points this paper is trying to address.

Segment objects from event and grayscale camera data
Predict pixel-level masks for defined object classes
Evaluate methods in spatio-temporal instance segmentation
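
To make the evaluation point concrete, here is a minimal sketch of mask intersection-over-union (IoU), a common building block when comparing a predicted instance mask with a ground-truth mask. This page does not state which metric the challenge uses, so IoU is an assumption chosen for illustration; the `Mask` type and the `mask_iou` function are hypothetical names, not part of the challenge code.

```python
from typing import List

# Assumed representation: a binary mask as rows of 0/1 values,
# where 1 marks a pixel that belongs to the instance.
Mask = List[List[int]]


def mask_iou(pred: Mask, target: Mask) -> float:
    """Return intersection-over-union of two binary masks of equal shape."""
    if len(pred) != len(target) or any(
        len(row_p) != len(row_t) for row_p, row_t in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            if p and t:
                intersection += 1
            if p or t:
                union += 1

    # Convention used in this sketch: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [1, 0]]
    print(mask_iou(predicted, ground_truth))
```

A spatio-temporal benchmark would additionally have to associate instances across time; this sketch covers only the per-frame mask overlap.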

Innovation

Methods, ideas, or system contributions that make the work stand out.

Event-based spatio-temporal instance segmentation
Combined event and grayscale camera data
Top-5 team methods analyzed
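
As an illustration of what combining event and grayscale data can look like at the input level, the sketch below accumulates events from a time window into per-pixel polarity counts and pairs them with the grayscale frame. It is not the pipeline of the challenge or of any participating team: the event layout `(x, y, timestamp, polarity)`, the window bounds, and the names `accumulate_events` and `stack_with_frame` are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumed event layout: (x, y, timestamp in seconds, polarity in {-1, +1}).
Event = Tuple[int, int, float, int]
Grid = List[List[int]]


def accumulate_events(
    events: List[Event], width: int, height: int, t_start: float, t_end: float
) -> Tuple[Grid, Grid]:
    """Count positive and negative events per pixel within [t_start, t_end)."""
    pos = [[0] * width for _ in range(height)]
    neg = [[0] * width for _ in range(height)]
    for x, y, t, polarity in events:
        if not (t_start <= t < t_end):
            continue  # event lies outside the chosen time window
        if not (0 <= x < width and 0 <= y < height):
            continue  # ignore events outside the sensor area
        if polarity > 0:
            pos[y][x] += 1
        else:
            neg[y][x] += 1
    return pos, neg


def stack_with_frame(gray: Grid, pos: Grid, neg: Grid) -> List[List[Tuple[int, int, int]]]:
    """Pair each grayscale pixel with its positive and negative event counts."""
    return [
        [(gray[y][x], pos[y][x], neg[y][x]) for x in range(len(gray[y]))]
        for y in range(len(gray))
    ]


if __name__ == "__main__":
    events = [(0, 0, 0.010, 1), (0, 0, 0.020, 1), (1, 0, 0.015, -1), (1, 1, 0.500, 1)]
    gray = [[10, 20], [30, 40]]
    pos, neg = accumulate_events(events, width=2, height=2, t_start=0.0, t_end=0.03)
    print(stack_with_frame(gray, pos, neg))
```

Counting events per polarity discards the fine timing inside the window; representations that keep several temporal bins are a common alternative when that timing matters.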

Authors

Friedhelm Hamann
TU Berlin
Event-based Vision, Tracking, Scene Understanding, Computer Vision

Emil Mededovic
RWTH Aachen

Fabian Gülhan
RWTH Aachen

Yuli Wu
RWTH Aachen University
Computer Vision, Retinal Prosthesis

Johannes Stegmaier
RWTH Aachen University
3D+t Image Analysis, Machine Learning, Microscopy, Developmental Biology, Medical Image Analysis

Jing He
Xidian University

Yiqing Wang
Xidian University

Kexin Zhang
Tsinghua University
Data Mining, Machine Learning

Lingling Li
Associate Director of Biostatistics, Sanofi Genzyme
Causal inference, missing data, propensity score, sequential analytic methods, drug and vaccine safety

Licheng Jiao
Distinguished Professor of Xidian University, IEEE Fellow
Neural Networks, Computational Intelligence, Evolutionary Computation, Remote Sensing, Pattern Recognition

Mengru Ma
Xidian University
Fusion Classification, Remote Sensing Intelligent Interpretation

Hongxiang Huang
Hong Kong University of Science and Technology

Yuhao Yan
Sun Yat-sen University

Hongwei Ren
Hong Kong University of Science and Technology

Xiaopeng Lin
Hong Kong University of Science and Technology

Yulong Huang
Hong Kong University of Science and Technology

Bojun Cheng
Hong Kong University of Science and Technology

Se Hyun Lee
Wonkwang University

Gyu Sung Ham
Wonkwang University

Kanghan Oh
Wonkwang University

Gi Hyun Lim
Wonkwang University
Cognitive Robotics, Artificial Intelligence, Machine Learning

Boxuan Yang
Tongji University

Bowen Du
Beihang University

Guillermo Gallego
TU Berlin, SCIoI, ECDF