JigsawComm: Joint Semantic Feature Encoding and Transmission for Communication-Efficient Cooperative Perception

📅 2025-11-21
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
To address the critical challenges of limited communication bandwidth and the neglect of semantic relevance and cross-agent feature redundancy in multi-agent collaborative perception (CP), this paper proposes the first end-to-end, semantic-aware CP framework. The method jointly optimizes semantic feature encoding and transmission by introducing a learnable feature utility estimator and a distributed meta-utility graph, enabling provably optimal feature selection. Furthermore, it employs a regularized encoder coupled with semantic sparse coding to keep communication overhead independent of the number of agents. Evaluated on the OPV2V and DAIR-V2X benchmarks, the framework achieves over 500× data compression while matching or surpassing state-of-the-art perception accuracy. This significantly improves bandwidth efficiency and system scalability, which are key bottlenecks in real-world deployment of cooperative autonomous systems.

๐Ÿ“ Abstract
Multi-agent cooperative perception (CP) promises to overcome the inherent occlusion and sensing-range limitations of single-agent systems (e.g., autonomous driving). However, its practicality is severely constrained by limited communication bandwidth. Existing approaches attempt to improve bandwidth efficiency via compression or heuristic message selection, without considering the semantic relevance or cross-agent redundancy of sensory data. We argue that a practical CP system must maximize the contribution of every transmitted bit to the final perception task, by extracting and transmitting semantically essential and non-redundant data. In this paper, we formulate a joint semantic feature encoding and transmission problem, which aims to maximize CP accuracy under limited bandwidth. To solve this problem, we introduce JigsawComm, an end-to-end trained, semantic-aware, and communication-efficient CP framework that learns to "assemble the puzzle" of multi-agent feature transmission. It uses a regularized encoder to extract semantically relevant and sparse features, and a lightweight Feature Utility Estimator to predict the contribution of each agent's features to the final perception task. The resulting meta utility maps are exchanged among agents and leveraged to compute a provably optimal transmission policy, which selects features from the agent with the highest utility score for each location. This policy inherently eliminates redundancy and achieves a scalable $\mathcal{O}(1)$ communication cost as the number of agents increases. On the OPV2V and DAIR-V2X benchmarks, JigsawComm reduces the total data volume by up to $>500\times$ while achieving matching or superior accuracy compared to state-of-the-art methods.
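The selection step described in the abstract — exchange per-agent utility maps, then let only the highest-utility agent transmit its feature at each spatial location — can be sketched as a per-location argmax. The function below is a hypothetical NumPy illustration under assumed shapes (agents × channels × BEV grid); the names and layout are not taken from the paper's code.

```python
import numpy as np

def select_features(utility_maps, features):
    """Sketch of a winner-take-all transmission policy.

    utility_maps: (A, H, W) array of per-agent utility scores.
    features:     (A, C, H, W) array of per-agent BEV features.

    Returns the fused feature map (C, H, W) and the per-agent
    transmission masks (A, H, W). Because exactly one agent "wins"
    each location, the total transmitted volume stays constant as
    the number of agents A grows.
    """
    A, C, H, W = features.shape
    winner = np.argmax(utility_maps, axis=0)                # (H, W)
    masks = np.stack([winner == a for a in range(A)])       # (A, H, W)
    fused = np.zeros((C, H, W), dtype=features.dtype)
    for a in range(A):
        # Agent a transmits only the locations where it has top utility.
        fused[:, masks[a]] = features[a][:, masks[a]]
    return fused, masks
```

Note that the masks partition the grid: every location is covered by exactly one agent, which is what eliminates cross-agent redundancy in this scheme.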
Problem

Research questions and friction points this paper is trying to address.

Maximizing cooperative perception accuracy under limited bandwidth constraints
Eliminating semantic redundancy in multi-agent sensory data transmission
Developing communication-efficient feature encoding for scalable perception systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Joint semantic feature encoding and transmission optimization
Feature utility estimator for cross-agent redundancy elimination
Scalable transmission policy with constant communication cost