Rethinking Event-Based Object Detection through Representation-Level Temporal Aggregation and Model-Level Hypergraph Reasoning

2026-05-09
AI Summary
This work addresses two limitations of existing event-based object detection methods: they encode temporal information inefficiently, and they struggle to aggregate fragmented events into high-order object representations. The authors propose Ev-DTAD, a unified framework that combines Hierarchical Temporal Aggregation (HTA), which produces compact three-channel pseudo-RGB representations, with Frequency-aware Hypergraph Temporal Fusion (FHTF), which models high-order relationships and temporal evolution. On the Gen1, 1Mpx/Gen4, and eTraM datasets, the method improves mAP by 0.8, 0.5, and 3.0, respectively, while running inference 1.6–2.0× faster.
πŸ“ Abstract
Event cameras provide microsecond-level temporal resolution, low latency, and high dynamic range, offering potential for perception under fast motion and challenging illumination conditions. However, existing Event-based Object Detection (EOD) methods face limitations at both the representation and model levels: prior event representations usually encode temporal information indirectly through redundant structures, while detection models struggle to explicitly aggregate fragmented event responses into coherent high-order object features. To address these limitations, we present Event Dual Temporal-Relational Aggregation Detector (Ev-DTAD), a unified EOD framework that integrates representation-level temporal encoding with model-level temporal-hypergraph reasoning. Specifically, we introduce Hierarchical Temporal Aggregation (HTA), a compact three-channel pseudo-RGB representation that explicitly embeds temporal information across intra- and inter-window events. To further enhance detection under sparse and fragmented event responses, we propose Frequency-aware Hypergraph Temporal Fusion (FHTF), which refines multi-scale event features through temporal evolution modeling and high-order relational reasoning. Extensive experiments on Gen1 (+0.8 mAP and 1.7× faster), 1Mpx/Gen4 (+0.5 mAP and 1.6× faster), and eTraM (+3.0 mAP and 2.0× faster) demonstrate that Ev-DTAD achieves a competitive accuracy-efficiency trade-off, validating the complementarity between compact temporal representation and temporal-hypergraph feature reasoning.
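To make the idea of a compact three-channel event representation concrete, here is a minimal sketch of one generic way to pack a window of events into an image-like array. It is not the authors' HTA, whose construction the abstract does not specify. The function name `events_to_pseudo_rgb`, the channel layout (positive counts, negative counts, latest timestamp), and the normalization are all assumptions chosen for illustration.

```python
import numpy as np


def events_to_pseudo_rgb(x, y, t, p, height, width):
    """Pack one window of events into a (height, width, 3) float32 array.

    Hypothetical channel layout (an assumption, not the paper's HTA):
      channel 0: per-pixel count of positive-polarity events, scaled to [0, 1]
      channel 1: per-pixel count of negative-polarity events, scaled to [0, 1]
      channel 2: normalized timestamp of the most recent event at each pixel
    """
    x = np.asarray(x, dtype=np.int64)
    y = np.asarray(y, dtype=np.int64)
    t = np.asarray(t, dtype=np.float64)
    p = np.asarray(p)

    pos_cnt = np.zeros((height, width), dtype=np.float32)
    neg_cnt = np.zeros((height, width), dtype=np.float32)
    latest = np.zeros((height, width), dtype=np.float32)
    if t.size == 0:
        return np.stack([pos_cnt, neg_cnt, latest], axis=-1)

    # Works for both {0, 1} and {-1, +1} polarity encodings.
    pos = p > 0

    # ufunc.at accumulates correctly when the same pixel occurs repeatedly.
    np.add.at(pos_cnt, (y[pos], x[pos]), 1.0)
    np.add.at(neg_cnt, (y[~pos], x[~pos]), 1.0)
    for cnt in (pos_cnt, neg_cnt):
        peak = cnt.max()
        if peak > 0:
            cnt /= peak

    # Normalize timestamps to [0, 1] within the window. The earliest event
    # maps to 0 and is therefore indistinguishable from "no event" here.
    span = t.max() - t.min()
    t_norm = (t - t.min()) / span if span > 0 else np.ones_like(t)
    np.maximum.at(latest, (y, x), t_norm.astype(np.float32))

    return np.stack([pos_cnt, neg_cnt, latest], axis=-1)


if __name__ == "__main__":
    img = events_to_pseudo_rgb(
        x=[0, 0, 1], y=[0, 0, 1], t=[0, 5, 10], p=[1, 1, 0],
        height=2, width=2,
    )
    print(img.shape)
```

A three-channel layout like this can be fed to detectors that expect RGB input without changing their first layer, which is one plausible reason a pseudo-RGB form is convenient. How HTA combines intra-window and inter-window events is not covered by this sketch.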
Problem

Research questions and friction points this paper is trying to address.

Event-based Object Detection
Temporal Aggregation
Hypergraph Reasoning
Event Representation
Fragmented Event Responses
Innovation

Methods, ideas, or system contributions that make the work stand out.

Temporal Aggregation
Hypergraph Reasoning
Event-based Object Detection
Compact Representation
High-order Relational Modeling