🤖 AI Summary
To address asynchronous multi-agent feature misalignment and semantic inconsistency in vehicle-to-vehicle (V2V) cooperative perception caused by communication latency, this paper proposes a feature-level trajectory modeling framework. It formulates motion compensation as a spatiotemporally continuous attention path, enabling temporally ordered sampling and semantic alignment of historical features for the current query. The core contributions include: (i) the first Transformer-based trajectory-aware attention mechanism; (ii) a differentiable temporal sampling module; and (iii) a cross-frame feature propagation and reconstruction network. This unified approach jointly resolves both spatial and semantic misalignment, significantly improving consistency and real-time performance in asynchronous feature fusion. Extensive experiments demonstrate state-of-the-art performance on the V2V4Real and DAIR-V2X-Seq benchmarks, establishing a new paradigm for asynchronous cooperative perception.
📝 Abstract
Cooperative perception presents significant potential for enhancing the sensing capabilities of individual vehicles; however, inter-agent latency remains a critical challenge. Latency causes misalignment in both spatial and semantic features, complicating the fusion of real-time observations from the ego vehicle with delayed data from others. To address these issues, we propose TraF-Align, a novel framework that learns the flow path of features by predicting the feature-level trajectory of objects from past observations up to the ego vehicle's current time. By generating temporally ordered sampling points along these paths, TraF-Align directs attention from the current-time query to relevant historical features along each trajectory, supporting the reconstruction of current-time features and promoting semantic interaction across multiple frames. This approach corrects spatial misalignment and ensures semantic consistency across agents, effectively compensating for motion and achieving coherent feature fusion. Experiments on two real-world datasets, V2V4Real and DAIR-V2X-Seq, show that TraF-Align sets a new benchmark for asynchronous cooperative perception.
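The core idea of trajectory-guided sampling and attention can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the bilinear sampler, and the dot-product softmax scoring over trajectory samples are all simplifying assumptions; the paper's module is learned end-to-end over BEV feature maps.

```python
import numpy as np

def bilinear_sample(feat, x, y):
    """Bilinear lookup on a (C, H, W) feature map at a fractional (x, y).
    Coordinates are assumed to lie inside the map for this sketch."""
    C, H, W = feat.shape
    x0 = int(np.clip(np.floor(x), 0, W - 2))
    y0 = int(np.clip(np.floor(y), 0, H - 2))
    x1, y1 = x0 + 1, y0 + 1
    wx, wy = x - x0, y - y0
    return ((1 - wy) * ((1 - wx) * feat[:, y0, x0] + wx * feat[:, y0, x1])
            + wy * ((1 - wx) * feat[:, y1, x0] + wx * feat[:, y1, x1]))

def trajectory_attention(query, past_feats, traj_points, tau=1.0):
    """Reconstruct a current-time feature from delayed frames.

    query:       (C,) current-time query vector at one BEV cell.
    past_feats:  list of K delayed feature maps, each (C, H, W),
                 ordered oldest -> newest.
    traj_points: (K, 2) predicted (x, y) sampling point per past frame,
                 i.e. the object's back-traced position at that time.
    Returns the (C,) feature aggregated along the predicted trajectory.
    """
    # Temporally ordered samples along the predicted feature-level trajectory.
    samples = np.stack([bilinear_sample(f, x, y)
                        for f, (x, y) in zip(past_feats, traj_points)])  # (K, C)
    # Scaled dot-product scores between the query and each historical sample
    # (an illustrative stand-in for the learned attention).
    logits = samples @ query / (tau * np.sqrt(query.size))
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()          # softmax over the K trajectory points
    return weights @ samples          # weighted reconstruction of the current feature
```

Because the sampling points follow a per-object motion path rather than a fixed grid, the attention compensates for the displacement an object undergoes during the communication delay, which is the spatial half of the misalignment the abstract describes; the weighted mixing of multi-frame samples addresses the semantic half.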