CoopTrack: Exploring End-to-End Learning for Efficient Cooperative Sequential Perception

📅 2025-07-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address insufficient sequence-aware modeling and excessive communication overhead in multi-vehicle cooperative 3D multi-object tracking, this paper proposes CoopTrack, a fully instance-level, end-to-end cooperative tracking framework. CoopTrack combines learnable instance association with sparse instance-level feature transmission, which keeps transmission costs low. It has two key components: Multi-Dimensional Feature Extraction, which represents each instance with semantic and motion features, and Cross-Agent Association and Aggregation, which adaptively associates and fuses instances across agents based on a feature graph. Experiments on the V2X-Seq and Griffin datasets show strong performance; on V2X-Seq, CoopTrack attains state-of-the-art results with 39.0% mAP and 32.8% AMOTA.

📝 Abstract

Cooperative perception aims to address the inherent limitations of single-vehicle autonomous driving systems through information exchange among multiple agents. Previous research has primarily focused on single-frame perception tasks. However, the more challenging cooperative sequential perception tasks, such as cooperative 3D multi-object tracking, have not been thoroughly investigated. Therefore, we propose CoopTrack, a fully instance-level end-to-end framework for cooperative tracking, featuring learnable instance association, which fundamentally differs from existing approaches. CoopTrack transmits sparse instance-level features that significantly enhance perception capabilities while maintaining low transmission costs. Furthermore, the framework comprises two key components: Multi-Dimensional Feature Extraction, and Cross-Agent Association and Aggregation, which collectively enable comprehensive instance representation with semantic and motion features, and adaptive cross-agent association and fusion based on a feature graph. Experiments on both the V2X-Seq and Griffin datasets demonstrate that CoopTrack achieves excellent performance. Specifically, it attains state-of-the-art results on V2X-Seq, with 39.0% mAP and 32.8% AMOTA. The project is available at https://github.com/zhongjiaru/CoopTrack.

Problem

Research questions and friction points this paper is trying to address.

Addresses limitations of single-vehicle perception via multi-agent cooperation

Focuses on cooperative sequential perception, not just single-frame tasks

Seeks efficient instance-level cooperative tracking while keeping transmission costs low

Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end learning for cooperative sequential perception

Learnable instance association framework

Sparse instance-level feature transmission
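To make the idea of cross-agent association and aggregation over a feature graph more concrete, here is a minimal sketch. It is not CoopTrack's implementation: the paper learns its association, whereas this example swaps in a fixed cosine-similarity affinity with greedy one-to-one matching, and fuses matched instances by simple averaging. The function names, the `threshold` parameter, and the plain-list feature format are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def associate_and_fuse(ego, remote, threshold=0.8):
    """Match ego and remote instance features, then fuse the matched pairs.

    Hypothetical stand-in for learned association: every ego/remote pair is an
    edge weighted by cosine similarity, and edges are accepted greedily from
    strongest to weakest as long as they exceed `threshold`.
    """
    edges = sorted(
        ((cosine(e, r), i, j)
         for i, e in enumerate(ego)
         for j, r in enumerate(remote)),
        reverse=True,
    )
    used_ego, used_remote, matches = set(), set(), []
    for score, i, j in edges:
        if score < threshold:
            break  # remaining edges are weaker still
        if i in used_ego or j in used_remote:
            continue  # keep the matching one-to-one
        used_ego.add(i)
        used_remote.add(j)
        matches.append((i, j))

    # Matched instances are averaged; unmatched ego instances are kept as-is.
    fused = [list(f) for f in ego]
    for i, j in matches:
        fused[i] = [(x + y) / 2 for x, y in zip(ego[i], remote[j])]
    # Unmatched remote instances are appended as additional instances.
    fused += [list(remote[j]) for j in range(len(remote)) if j not in used_remote]
    return matches, fused


if __name__ == "__main__":
    ego = [[1.0, 0.0], [0.0, 1.0]]
    remote = [[0.9, 0.1], [-1.0, 0.0]]
    matches, fused = associate_and_fuse(ego, remote)
    print(matches, len(fused))
```

Only the per-instance vectors in `remote` would need to cross the network in a scheme like this, which is the intuition behind sparse instance-level transmission; the actual payload and fusion used by CoopTrack are described in the paper and repository, not here.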
