🤖 AI Summary
Missing key modalities frequently cause severe performance degradation in multimodal learning. To address this, the authors propose Meta-learned Modality-weighted Knowledge Distillation (MetaKD), a framework that adaptively estimates modality importance via meta-learning and uses those importance weights to guide pairwise cross-modal knowledge transfer, achieving task-agnostic robustness for both classification and segmentation. MetaKD deeply integrates meta-learning with modality-weighted distillation, requires no architectural modifications to downstream task-specific models, and supports end-to-end training. Extensive experiments on five benchmark datasets, including BraTS2018-2020, ADNI, and Audiovision-MNIST, demonstrate consistent and significant improvements over state-of-the-art methods, especially under single-modality-missing scenarios, where the model maintains high accuracy. The implementation is publicly available.
📝 Abstract
In multi-modal learning, some modalities are more influential than others, and their absence can significantly degrade classification or segmentation accuracy. To address this challenge, we propose a novel approach called Meta-learned Modality-weighted Knowledge Distillation (MetaKD), which enables multi-modal models to maintain high accuracy even when key modalities are missing. MetaKD adaptively estimates the importance weight of each modality through a meta-learning process. These learned importance weights guide a pairwise modality-weighted knowledge distillation process, allowing high-importance modalities to transfer knowledge to lower-importance ones, resulting in robust performance despite missing inputs. Unlike previous methods in the field, which are often task-specific and require significant modifications, our approach is designed to work across multiple tasks (e.g., segmentation and classification) with minimal adaptation. Experimental results on five widely used datasets, including three Brain Tumor Segmentation datasets (BraTS2018, BraTS2019 and BraTS2020), the Alzheimer's Disease Neuroimaging Initiative (ADNI) classification dataset and the Audiovision-MNIST classification dataset, demonstrate that the proposed model outperforms the compared models by a large margin. The code is available at https://github.com/billhhh/MetaKD.
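The pairwise modality-weighted distillation idea described above can be illustrated with a minimal sketch. This is a hypothetical, simplified rendering of the loss, not the authors' implementation: the function names, the soft-target KL formulation, and the rule that knowledge only flows from higher-importance to lower-importance modalities (with each term scaled by the teacher modality's meta-learned weight) are assumptions drawn from the abstract's description.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax over one example's class logits.
    m = max(x / T for x in logits)
    exps = [math.exp(x / T - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def pairwise_weighted_kd_loss(logits, weights, T=4.0):
    """Sketch of a pairwise modality-weighted distillation loss (one example).

    logits:  dict modality -> list of class logits from that modality's branch
    weights: dict modality -> meta-learned importance scalar
    Knowledge is transferred only from higher-importance to lower-importance
    modalities; each KL term is scaled by the teacher modality's weight.
    """
    loss = 0.0
    for t in weights:                      # teacher candidate
        for s in weights:                  # student candidate
            if t == s or weights[t] <= weights[s]:
                continue                   # only high -> low importance transfer
            p = softmax(logits[t], T)      # soft targets from the teacher modality
            q = softmax(logits[s], T)      # student modality's predictions
            kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
            loss += weights[t] * kl * T * T  # usual T^2 scaling for soft targets
    return loss
```

In a full system this term would be added to the task loss (Dice or cross-entropy), while an outer meta-learning loop updates the importance weights themselves.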