🤖 AI Summary
To address the deployment challenges of large models in medical image classification under computational resource constraints, this paper proposes a lightweight framework integrating dual-model weight selection and self-knowledge distillation. The method initializes two compact backbone networks with weights from a pretrained large model; a dynamic weight selection mechanism preserves critical feature representations, while self-knowledge distillation enables intra-model knowledge transfer and joint optimization. Extensive experiments on multimodal public benchmarks, including chest X-ray, lung CT, and brain MRI datasets, demonstrate that the proposed approach significantly outperforms existing lightweight models. It achieves classification accuracy approaching that of its large-model counterpart while reducing the parameter count by over 80% (to fewer than one-fifth of the parameters) and maintaining low computational overhead. Moreover, the method exhibits enhanced robustness to distribution shifts and superior cross-dataset generalization.
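The summary does not specify how weights are selected from the large model. A minimal sketch of the initialization step, assuming (hypothetically) that one compact model keeps the output channels with the largest L1 norm while the second uses a random selection, giving the pair diverse starting points:

```python
import numpy as np

def select_weights(large_w, n_out):
    """Keep the n_out output channels of a large layer (out_ch x in_ch)
    with the largest L1 norm. The magnitude criterion is a hypothetical
    stand-in for the paper's dynamic weight selection mechanism."""
    scores = np.abs(large_w).sum(axis=1)      # importance score per output channel
    keep = np.argsort(scores)[-n_out:]        # indices of the top-n_out channels
    return large_w[np.sort(keep)]

rng = np.random.default_rng(0)
large_layer = rng.normal(size=(64, 32))       # one layer of the pretrained large model

# Initialize two compact models from the same pretrained weights:
# model A via magnitude-based selection, model B via random selection.
student_a = select_weights(large_layer, 16)
idx_b = np.sort(rng.choice(64, size=16, replace=False))
student_b = large_layer[idx_b]

print(student_a.shape, student_b.shape)       # (16, 32) (16, 32)
```

Both compact layers inherit rows verbatim from the pretrained layer, so pretrained feature detectors survive the size reduction; only the selection rule here is assumed.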
📝 Abstract
We propose a novel medical image classification method that integrates dual-model weight selection with self-knowledge distillation (SKD). In real-world medical settings, deploying large-scale models is often limited by computational resource constraints, posing significant challenges for practical implementation. Thus, developing lightweight models that achieve performance comparable to large-scale models while maintaining computational efficiency is crucial. To address this, we employ a dual-model weight selection strategy that initializes two lightweight models with weights derived from a large pretrained model, enabling effective knowledge transfer. Next, SKD is applied to these selected models, allowing the use of a broad range of initial weight configurations without imposing excessive additional computational cost, followed by fine-tuning for the target classification tasks. By combining dual-model weight selection with SKD, our method overcomes the limitations of conventional approaches, which often fail to retain critical information in compact models. Extensive experiments on publicly available datasets (chest X-ray images, lung computed tomography scans, and brain magnetic resonance imaging scans) demonstrate the superior performance and robustness of our approach compared to existing methods.
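The abstract does not give the SKD objective. A minimal sketch of one direction of a mutual-distillation loss, assuming a standard formulation: cross-entropy on hard labels plus a temperature-softened KL term toward the peer model's predictions (the temperature T, weight alpha, and the mutual-learning form are all assumptions, not taken from the paper):

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(logits_s, logits_t, labels, T=4.0, alpha=0.5):
    """Hard-label cross-entropy plus KL divergence to the peer model's
    softened predictions (one direction of a mutual-distillation
    objective; T, alpha, and the T**2 scaling follow the common
    Hinton-style recipe, not the paper's exact formulation)."""
    p_s = softmax(logits_s)
    ce = -np.log(p_s[np.arange(len(labels)), labels] + 1e-12).mean()
    q_t = softmax(logits_t, T)                # peer's softened distribution
    q_s = softmax(logits_s, T)                # this model's softened distribution
    kl = (q_t * (np.log(q_t + 1e-12) - np.log(q_s + 1e-12))).sum(axis=-1).mean()
    return (1 - alpha) * ce + alpha * (T ** 2) * kl

rng = np.random.default_rng(1)
logits = rng.normal(size=(8, 3))              # one model's batch of class logits
labels = np.zeros(8, dtype=int)               # hypothetical ground-truth labels
loss = distill_loss(logits, logits + rng.normal(size=(8, 3)), labels)
```

In a two-model setup, each compact model would take the other's softened outputs as its distillation target, so the KL term vanishes only when the pair agree.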