MMCD: Multi-Modal Collaborative Decision-Making for Connected Autonomy with Knowledge Distillation

📅 2025-09-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the robustness deficiencies of connected autonomous driving in accident-prone scenarios, where limited ego-vehicle perception range, unreliable multi-modal sensor data, and unstable vehicle-to-vehicle collaboration undermine reliability, this paper proposes a cross-modal knowledge distillation framework. The method adopts a teacher-student architecture that jointly models RGB images, LiDAR point clouds, and cooperative perception inputs from both the ego vehicle and collaborating vehicles, enabling comprehensive fusion of heterogeneous multi-modal data during training. At inference time, it supports dynamic, robust decision-making under arbitrary modality or collaborator dropout. Its key innovation lies in being the first to apply knowledge distillation to multi-vehicle, multi-modal cooperative perception, mitigating the performance degradation caused by sensor failures or communication interruptions. Experiments on connected driving with ground vehicles and on aerial-ground vehicle collaboration demonstrate improvements of up to 20.7% over the best existing baseline in detecting potential accidents and making safe driving decisions.

📝 Abstract
Autonomous systems have advanced significantly, but challenges persist in accident-prone environments where robust decision-making is crucial. A single vehicle's limited sensor range and obstructed views increase the likelihood of accidents. Multi-vehicle connected systems and multi-modal approaches, leveraging RGB images and LiDAR point clouds, have emerged as promising solutions. However, existing methods often assume the availability of all data modalities and connected vehicles during both training and testing, which is impractical due to potential sensor failures or missing connected vehicles. To address these challenges, we introduce a novel framework MMCD (Multi-Modal Collaborative Decision-making) for connected autonomy. Our framework fuses multi-modal observations from ego and collaborative vehicles to enhance decision-making under challenging conditions. To ensure robust performance when certain data modalities are unavailable during testing, we propose an approach based on cross-modal knowledge distillation with a teacher-student model structure. The teacher model is trained with multiple data modalities, while the student model is designed to operate effectively with reduced modalities. In experiments on connected autonomous driving with ground vehicles and aerial-ground vehicles collaboration, our method improves driving safety by up to 20.7%, surpassing the best existing baseline in detecting potential accidents and making safe driving decisions. More information can be found on our website https://ruiiu.github.io/mmcd.
Problem

Research questions and friction points this paper is trying to address.

Enhancing decision-making for autonomous vehicles in accident-prone environments
Addressing sensor limitations and missing data modalities during testing
Improving safety through multi-modal collaborative vehicle systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fuses multi-modal observations from connected vehicles
Uses cross-modal knowledge distillation for robustness
Employs a teacher-student model to handle missing data (a minimal sketch follows below)
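
The teacher-student distillation idea listed above is simple enough to sketch. Below is a minimal, hypothetical PyTorch illustration of a teacher that consumes all modalities and a student trained to match it from a reduced input set. The encoder sizes, the average fusion, the softened-logit (Hinton-style) distillation loss, and the names FusionPolicy and distill_step are our assumptions for illustration, not the paper's actual architecture or API.

```python
# Minimal sketch of teacher-student cross-modal knowledge distillation.
# Assumptions: pre-extracted RGB/LiDAR feature vectors, average fusion,
# and a standard softened-logit distillation loss. Not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionPolicy(nn.Module):
    """Encodes whichever modality features are given and predicts a decision."""
    def __init__(self, rgb_dim=512, lidar_dim=512, num_actions=3):
        super().__init__()
        self.rgb_enc = nn.Sequential(nn.Linear(rgb_dim, 256), nn.ReLU())
        self.lidar_enc = nn.Sequential(nn.Linear(lidar_dim, 256), nn.ReLU())
        self.head = nn.Linear(256, num_actions)

    def forward(self, rgb=None, lidar=None):
        feats = []
        if rgb is not None:
            feats.append(self.rgb_enc(rgb))
        if lidar is not None:
            feats.append(self.lidar_enc(lidar))
        fused = torch.stack(feats).mean(dim=0)  # average-fuse available modalities
        return self.head(fused)

def distill_step(teacher, student, rgb, lidar, labels, T=2.0, alpha=0.5):
    """One training step: the teacher sees all modalities; the student sees a
    reduced set (RGB only here) and is trained to match the teacher's logits."""
    with torch.no_grad():
        teacher_logits = teacher(rgb=rgb, lidar=lidar)  # full-modality teacher
    student_logits = student(rgb=rgb)  # LiDAR withheld from the student
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * T * T  # rescale gradients for the softened distributions
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```

In MMCD's setting the dropped input could equally be a collaborator's shared observation rather than a sensor modality; the same teacher-student loss applies, which is what lets the student keep making safe decisions when modalities or collaborators go missing at test time.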
👥 Authors
Rui Liu (University of Maryland, College Park)
Zikang Wang (Institute of Automation, Chinese Academy of Sciences)
Peng Gao (North Carolina State University)
Yu Shen (Adobe Research)
Pratap Tokekar (Associate Professor, University of Maryland; Robotics)
Ming Lin (University of Maryland, College Park)