📝 Abstract
We present an overview of the Spatio-temporal Instance Segmentation (SIS) challenge held in conjunction with the CVPR 2025 Event-based Vision Workshop. The task is to predict accurate pixel-level segmentation masks of defined object classes from spatio-temporally aligned event camera and grayscale camera data. We provide an overview of the task, dataset, challenge details and results. Furthermore, we describe the methods used by the top-5 ranking teams in the challenge. More resources and code of the participants' methods are available here: https://github.com/tub-rip/MouseSIS/blob/main/docs/challenge_results.md
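The challenge inputs include raw event streams, which most learning-based pipelines first convert into a dense tensor before feeding them to a segmentation network. As background only (the report above does not prescribe any particular representation, and participants' methods may differ), a common choice is the spatio-temporal voxel grid, sketched below with a hypothetical `events_to_voxel_grid` helper:

```python
import numpy as np

def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate an event stream into a spatio-temporal voxel grid
    with bilinear interpolation along the time axis.

    events: (N, 4) array with columns [t, x, y, p], polarity p in {-1, +1},
            sorted by timestamp t. (Illustrative format, not the
            challenge's actual data layout.)
    Returns: (num_bins, height, width) float32 array.
    """
    voxel = np.zeros((num_bins, height, width), dtype=np.float32)
    t = events[:, 0]
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = events[:, 3]
    # Normalize timestamps to [0, num_bins - 1].
    t_norm = (num_bins - 1) * (t - t[0]) / max(t[-1] - t[0], 1e-9)
    lower = np.floor(t_norm).astype(int)
    upper = np.minimum(lower + 1, num_bins - 1)
    frac = t_norm - lower
    # Split each event's polarity between its two adjacent time bins;
    # np.add.at accumulates correctly even with repeated pixel indices.
    np.add.at(voxel, (lower, y, x), p * (1.0 - frac))
    np.add.at(voxel, (upper, y, x), p * frac)
    return voxel
```

Because the two interpolation weights sum to one, each event contributes exactly its polarity to the grid, so the tensor preserves the event count and polarity balance while giving a fixed-size input that can be fused with the aligned grayscale frames.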