AI Summary
This work addresses a vulnerability in existing vehicular collaborative perception systems: they rely on cross-vehicle consistency checks to defend against falsified data, but the trust evaluation built on those checks can itself be exploited. We propose TrustFlip, a novel attack that deploys physical adversarial objects to induce perceptual inconsistencies among benign vehicles, misleading the system into downgrading the trust score of a targeted benign vehicle without injecting any forged data. TrustFlip excludes the targeted vehicle from collaboration in up to 87.7% of scenarios and reduces Average Precision (AP) by up to 13%. To counter this threat, we introduce TrustReflect, a lightweight mitigation based on an uncertainty-aware introspection mechanism that marks disputed regions as uncertain and excludes them from trust evaluation, reducing the attack success rate by 35-100%. Together, these results point to a new attack paradigm in which physical perturbations indirectly poison trust assessments.
Abstract
Collaborative perception (CP) enables connected and autonomous vehicles to share sensor data and jointly reason about their environment. To defend against adversaries that fabricate or manipulate shared data, existing systems employ cross-vehicle inconsistency detection and trust estimation, penalizing vehicles whose observations conflict with the majority. In this work, we show that these defenses themselves introduce a new attack surface. We present TrustFlip, a novel attack that weaponizes consistency-based defenses to poison the trust assigned to benign vehicles. Instead of injecting false data into the collaboration pipeline, it deploys physical adversarial objects that are genuine but induce inconsistent observations among benign vehicles. The resulting inconsistencies are misattributed by the defense to the targeted vehicle, causing its trust score to degrade and eventually leading to its downweighting or exclusion from collaboration. Consequently, the system loses reliable sensing contributors, degrading perception capability and potentially inducing safety-critical failures. We evaluate TrustFlip across multiple collaborative perception architectures and defense mechanisms. Our results show that state-of-the-art defenses can be significantly affected: the attack removes the targeted benign vehicle from collaboration in up to 87.7% of scenarios and drops Average Precision (AP) by up to 13%. As an initial mitigation, we introduce TrustReflect, a lightweight self-reflection mechanism that marks disputed regions as uncertain and excludes them from trust evaluation, reducing the attack success rate by 35-100%.
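To make the mechanism concrete, the following is a minimal toy sketch of a majority-based trust update and of excluding disputed regions from it. It is not the authors' implementation: the function `update_trust`, the per-region boolean detections, the fixed `penalty`, and the externally supplied `disputed` set are all assumptions chosen for illustration. In particular, how TrustReflect actually identifies uncertain regions is not described in the abstract, so the mask is simply passed in.

```python
from collections import Counter


def update_trust(trust, reports, disputed=frozenset(), penalty=0.2):
    """Toy consistency-based trust update (illustrative assumption only).

    trust:    {vehicle_id: score in [0, 1]}
    reports:  {vehicle_id: {region_id: bool}}, True = object detected there
    disputed: regions excluded from trust evaluation (hypothetical mask)
    """
    new_trust = dict(trust)
    regions = {r for obs in reports.values() for r in obs}
    for region in regions - set(disputed):
        # Collect every vehicle's claim about this region.
        votes = {v: obs[region] for v, obs in reports.items() if region in obs}
        if len(votes) < 2:
            continue  # nothing to cross-check against
        majority, top = Counter(votes.values()).most_common(1)[0]
        if top * 2 <= len(votes):
            continue  # tie: no strict majority, so nobody is penalized
        for vehicle, seen in votes.items():
            if seen != majority:
                # Penalize vehicles that disagree with the majority view.
                new_trust[vehicle] = max(0.0, new_trust[vehicle] - penalty)
    return new_trust


# Hypothetical scenario: a real physical object in region "r1" is detected
# only by the targeted benign vehicle C; benign vehicles A and B miss it.
trust = {"A": 1.0, "B": 1.0, "C": 1.0}
reports = {
    "A": {"r1": False, "r2": True},
    "B": {"r1": False, "r2": True},
    "C": {"r1": True, "r2": True},
}

attacked = update_trust(trust, reports)                    # C is penalized
mitigated = update_trust(trust, reports, disputed={"r1"})  # r1 is ignored
```

In this toy setting, no vehicle reports false data, yet C loses trust because its correct observation conflicts with the majority. Masking the disputed region leaves all scores unchanged, which mirrors the idea of excluding uncertain regions from trust evaluation.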