QuantV2X: A Fully Quantized Multi-Agent System for Cooperative Perception

📅 2025-09-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing V2X cooperative perception systems rely heavily on full-precision models, which leads to high computational and communication overhead, substantial latency, and poor real-time deployability in resource-constrained environments. This paper proposes QuantV2X, the first end-to-end fully quantized multi-agent framework for V2X cooperative perception, which applies a unified low-bit quantization strategy to both the neural network models and the transmitted messages. The system targets multi-modal, multi-agent intermediate fusion and preserves accuracy comparable to full-precision systems under low-bit constraints. Under deployment-oriented metrics, the method achieves a +9.5 mAP₃₀ gain over full-precision baselines while reducing system-level latency by 3.2× and lowering transmission bandwidth, and it allows larger models to fit within strict memory budgets. The core contribution is the application of full quantization to multi-agent cooperative perception, which eases the usual trade-off between accuracy, efficiency, and scalability in real-world V2X deployments.

📝 Abstract
Cooperative perception through Vehicle-to-Everything (V2X) communication offers significant potential for enhancing vehicle perception by mitigating occlusions and expanding the field of view. However, past research has predominantly focused on improving accuracy metrics without addressing the crucial system-level considerations of efficiency, latency, and real-world deployability. Notably, most existing systems rely on full-precision models, which incur high computational and transmission costs, making them impractical for real-time operation in resource-constrained environments. In this paper, we introduce QuantV2X, the first fully quantized multi-agent system designed specifically for efficient and scalable deployment of multi-modal, multi-agent V2X cooperative perception. QuantV2X introduces a unified end-to-end quantization strategy across both neural network models and transmitted message representations that simultaneously reduces computational load and transmission bandwidth. Remarkably, despite operating under low-bit constraints, QuantV2X achieves accuracy comparable to full-precision systems. More importantly, when evaluated under deployment-oriented metrics, QuantV2X reduces system-level latency by 3.2× and achieves a +9.5 improvement in mAP30 over full-precision baselines. Furthermore, QuantV2X scales more effectively, enabling larger and more capable models to fit within strict memory budgets. These results highlight the viability of a fully quantized multi-agent intermediate fusion system for real-world deployment. The system will be publicly released to promote research in this field: https://github.com/ucla-mobility/QuantV2X.
Problem

Research questions and friction points this paper is trying to address.

High computational and transmission costs in V2X perception systems
Lack of efficiency and latency optimization for real-world deployment
Full-precision models impractical for resource-constrained real-time operation
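
To make the transmission-cost point concrete, the following minimal sketch computes the raw size of one dense intermediate feature map at different bit widths. The feature-map shape and the helper name `payload_bytes` are hypothetical and chosen only for illustration; they are not taken from the paper, and the calculation ignores quantization metadata (scales, zero points) and any additional compression.

```python
def payload_bytes(channels: int, height: int, width: int, bits: int) -> int:
    """Bytes needed to send one dense feature map at the given bit width."""
    total_bits = channels * height * width * bits
    return (total_bits + 7) // 8  # round up to whole bytes


# Hypothetical feature-map shape (assumption, not from the paper).
shape = (64, 100, 252)

fp32 = payload_bytes(*shape, bits=32)  # full-precision payload
int8 = payload_bytes(*shape, bits=8)   # 8-bit quantized payload
int4 = payload_bytes(*shape, bits=4)   # 4-bit quantized payload

for name, size in [("fp32", fp32), ("int8", int8), ("int4", int4)]:
    print(f"{name}: {size / 1e6:.2f} MB per message")
```

For a fixed shape, the payload scales linearly with bit width, so moving from 32-bit to 4-bit codes shrinks each message by a factor of eight before any further encoding.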
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fully quantized multi-agent V2X system
End-to-end quantization strategy for efficiency
Low-bit operation with accuracy comparable to full precision
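
As a rough illustration of what quantizing a transmitted message involves, the sketch below applies generic uniform affine quantization to a list of feature values and reconstructs them on the receiving side. This is a textbook scheme with per-tensor min/max calibration, assumed here for illustration; it is not the quantization method used by QuantV2X, and the function names and random test data are hypothetical.

```python
import random


def quantize(values, bits):
    """Map floats to integer codes in [0, 2**bits - 1] using a per-tensor range."""
    qmax = (1 << bits) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax if hi > lo else 1.0  # avoid division by zero
    codes = [min(qmax, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Reconstruct approximate floats from integer codes on the receiver."""
    return [c * scale + lo for c in codes]


# Synthetic stand-in for a flattened feature map (assumption, not real data).
random.seed(0)
features = [random.gauss(0.0, 1.0) for _ in range(1024)]

codes, scale, lo = quantize(features, bits=4)   # sender side
restored = dequantize(codes, scale, lo)         # receiver side

# Rounding to the nearest code bounds the error by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(features, restored))
print(f"step = {scale:.4f}, max reconstruction error = {max_err:.4f}")
```

Only the integer codes plus the two floats `scale` and `lo` would need to be sent. The reconstruction error is bounded by half a quantization step, which is why keeping accuracy close to full precision at low bit widths typically depends on how the range is calibrated.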