🤖 AI Summary
This review addresses the challenges of deploying large deep learning models in clinical settings (high computational cost, latency, and the privacy risks of cloud-based processing), which limit their use in resource-constrained medical environments. It surveys efficient, lightweight architectures for medical imaging and organizes them into three streams: lightweight CNNs, lightweight Transformers, and linear-complexity models. It also examines model compression techniques, including pruning, quantization, knowledge distillation, and low-rank factorization, and evaluates how well they maintain diagnostic performance while reducing hardware requirements. Finally, it identifies current limitations and discusses the transition toward on-device intelligence, offering a roadmap for bringing high-performance AI to resource-constrained clinical environments.
📝 Abstract
Deep learning has revolutionized medical image analysis, playing a vital role in modern clinical applications. However, the deployment of large-scale models in real-world clinical settings remains challenging due to high computational costs, latency constraints, and patient data privacy concerns associated with cloud-based processing. To address these bottlenecks, this review provides a comprehensive synthesis of efficient and lightweight deep learning architectures specifically tailored for the medical domain. We categorize the landscape of modern efficient models into three primary streams: Convolutional Neural Networks (CNNs), Lightweight Transformers, and emerging Linear Complexity Models. Furthermore, we examine key model compression strategies (including pruning, quantization, knowledge distillation, and low-rank factorization) and evaluate their efficacy in maintaining diagnostic performance while reducing hardware requirements. By identifying current limitations and discussing the transition toward on-device intelligence, this review serves as a roadmap for researchers and practitioners aiming to bridge the gap between high-performance AI and resource-constrained clinical environments.
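As a rough illustration of three of the compression strategies named above, the following minimal sketch applies magnitude pruning and symmetric int8 quantization to a toy weight list, and counts the parameters saved by a low-rank factorization. It is not taken from the review: the function names, the toy weights, the 50% sparsity level, and the 64 x 128 layer shape are all assumptions chosen for illustration. Knowledge distillation is omitted because it requires a training loop.

```python
def magnitude_prune(weights, sparsity):
    """Zero the `sparsity` fraction of weights with the smallest magnitude."""
    k = int(sparsity * len(weights))
    if k == 0:
        return list(weights)
    # The k-th smallest magnitude becomes the pruning threshold.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]


def quantize_int8(weights):
    """Symmetric per-tensor quantization to integer codes in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


def low_rank_param_count(m, n, rank):
    """Parameters after replacing an m x n matrix with (m x rank) @ (rank x n)."""
    return rank * (m + n)


# Toy weights (hypothetical values, not from any real model).
weights = [0.9, -0.05, 0.3, -0.7, 0.02, 0.45, -0.15, 0.6]

pruned = magnitude_prune(weights, 0.5)      # half of the weights become zero
codes, scale = quantize_int8(weights)       # 8-bit codes plus one float scale
restored = dequantize(codes, scale)         # approximate reconstruction

dense_params = 64 * 128                     # hypothetical dense layer
factored_params = low_rank_param_count(64, 128, 8)
```

In this sketch the quantization error per weight is bounded by half the scale, and the rank-8 factorization stores 1,536 parameters in place of 8,192. Whether diagnostic performance survives such reductions is an empirical question that depends on the model and task.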