TraF-Align: Trajectory-aware Feature Alignment for Asynchronous Multi-agent Perception

📅 2025-03-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the feature misalignment and semantic inconsistency that communication latency introduces into vehicle-to-vehicle (V2V) cooperative perception, this paper proposes TraF-Align, a feature-level trajectory modeling framework. It formulates motion compensation as a spatiotemporally continuous attention path, enabling temporally ordered sampling and semantic alignment of historical features for the current-time query. The core contributions include: (i) a trajectory-aware attention mechanism that directs current-time queries to relevant historical features along predicted object paths; (ii) temporally ordered sampling along those paths; and (iii) cross-frame feature propagation and reconstruction. This unified approach jointly corrects spatial misalignment and maintains semantic consistency in asynchronous feature fusion. Experiments on the real-world V2V4Real and DAIR-V2X-Seq benchmarks show state-of-the-art performance for asynchronous cooperative perception.

📝 Abstract
Cooperative perception presents significant potential for enhancing the sensing capabilities of individual vehicles; however, inter-agent latency remains a critical challenge. Latency causes misalignments in both spatial and semantic features, complicating the fusion of real-time observations from the ego vehicle with delayed data from others. To address these issues, we propose TraF-Align, a novel framework that learns the flow path of features by predicting the feature-level trajectory of objects from past observations up to the ego vehicle's current time. By generating temporally ordered sampling points along these paths, TraF-Align directs attention from the current-time query to relevant historical features along each trajectory, supporting the reconstruction of current-time features and promoting semantic interaction across multiple frames. This approach corrects spatial misalignment and ensures semantic consistency across agents, effectively compensating for motion and achieving coherent feature fusion. Experiments on two real-world datasets, V2V4Real and DAIR-V2X-Seq, show that TraF-Align sets a new benchmark for asynchronous cooperative perception.
Problem

Research questions and friction points this paper is trying to address.

Addresses spatial and semantic misalignments in multi-agent perception
Compensates for inter-agent latency in cooperative perception systems
Ensures semantic consistency across asynchronous agent observations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Predicts feature-level trajectory for alignment
Generates temporally ordered sampling points
Ensures semantic consistency across agents
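The core idea above can be sketched in a few lines: sample each delayed agent's feature map at the predicted trajectory point for that frame, then let the current-time query attend over the temporally ordered samples to reconstruct a current-time feature. This is an illustrative NumPy sketch under assumed shapes, not the paper's implementation; `bilinear_sample` and `traj_align` are hypothetical names.

```python
import numpy as np

def bilinear_sample(feat, x, y):
    """Bilinearly sample a (C, H, W) feature map at continuous (x, y)."""
    H, W = feat.shape[1:]
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * feat[:, y0, x0]
            + wx * (1 - wy) * feat[:, y0, x1]
            + (1 - wx) * wy * feat[:, y1, x0]
            + wx * wy * feat[:, y1, x1])

def traj_align(query, hist_feats, traj_points):
    """Reconstruct a current-time feature from delayed frames.

    query:       (C,) current-time query feature at the ego vehicle.
    hist_feats:  list of (C, H, W) delayed BEV feature maps, oldest first.
    traj_points: list of (x, y) predicted object positions, one per frame,
                 i.e. the temporally ordered sampling points along the path.
    """
    # Sample each historical frame at its predicted trajectory point.
    samples = np.stack([bilinear_sample(f, x, y)
                        for f, (x, y) in zip(hist_feats, traj_points)])
    # Scaled dot-product attention from the query over the ordered samples.
    logits = samples @ query / np.sqrt(query.shape[0])
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return weights @ samples  # (C,) reconstructed current-time feature
```

In the actual model the trajectory points come from a learned feature-level trajectory predictor and the attention is part of an end-to-end network; the sketch only shows how ordered sampling plus attention can turn delayed features into a motion-compensated current-time feature.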
Zhiying Song
Tsinghua University
Cooperative Perception
Lei Yang
School of Vehicle and Mobility, Tsinghua University
Fuxi Wen
Associate Professor (research-track), Tsinghua University
Multi-Agent Systems · Vehicular Communications · Cooperative Perception
Jun Li
School of Vehicle and Mobility, Tsinghua University; State Key Lab of Intelligent Green Vehicle and Mobility