🤖 AI Summary
Deploying foundation models in computational pathology remains challenging because of their high computational cost, which makes them impractical in clinical settings. Method: This paper proposes a cross-magnification distillation framework that transfers knowledge from a high-magnification teacher model to a lightweight low-magnification student model. It introduces a dual-level distillation mechanism (global representation alignment and local token mapping) that is optimized end-to-end. The student model processes whole-slide images (WSIs) at 5× magnification using a compact backbone architecture. Results: Evaluated on six cancer pathology tasks, the distilled model achieves accuracy within 1% of the large teacher model, reaches an inference speed of 8.8 slides/minute (a 30× acceleration), and significantly reduces GPU memory use and FLOPs. It also demonstrates strong cross-institutional generalization. According to the summary, this is the first work to systematically apply cross-magnification knowledge distillation to compress pathology foundation models, establishing a new paradigm for efficient clinical deployment.
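The throughput figures can be unpacked with a little arithmetic. The sketch below assumes that the 30× acceleration is measured against a baseline pipeline under otherwise identical conditions (an assumption; the summary does not state the measurement setup) and derives the per-slide latency that this would imply. All function and variable names are illustrative.

```python
def seconds_per_slide(slides_per_minute: float) -> float:
    """Convert a throughput in slides/minute into a latency in seconds/slide."""
    if slides_per_minute <= 0:
        raise ValueError("throughput must be positive")
    return 60.0 / slides_per_minute


# Figures quoted in the summary.
student_rate = 8.8   # slides per minute for the distilled model
speedup = 30.0       # reported acceleration factor

# Assumption: the baseline throughput is simply the student rate divided
# by the reported speedup (same hardware, same slides).
baseline_rate = student_rate / speedup

student_latency = seconds_per_slide(student_rate)
baseline_latency = seconds_per_slide(baseline_rate)

print(f"student:  {student_latency:.1f} s per slide")
print(f"baseline: {baseline_latency:.1f} s per slide")
```

Under that assumption the distilled model needs just under 7 seconds per slide, against roughly 3.4 minutes per slide for the baseline.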
📝 Abstract
Foundation models (FMs) have transformed computational pathology but remain computationally prohibitive for clinical deployment due to their massive parameter counts and high-magnification processing requirements. Here, we introduce XMAG, a lightweight FM developed through cross-magnification distillation that transfers knowledge from a state-of-the-art 20x magnification teacher to an efficient 5x magnification student architecture. XMAG employs a compact backbone and operates entirely at 5x, requiring 11.3 times fewer patches per whole slide image (WSI) compared to existing approaches. Our novel distillation framework incorporates dual-level knowledge transfer, combining global image representation alignment with local spatial token mapping. We trained XMAG on 3.49 million images curated from publicly available datasets and evaluated performance across six clinically relevant histopathology analysis tasks spanning multiple cancer types. XMAG achieved diagnostic accuracy within 1% of substantially larger foundation models while delivering 30-fold processing acceleration, reaching a processing speed of 8.8 WSIs per minute. Cross-institutional validation confirmed robust generalization. We also developed an end-to-end training strategy that further narrows the performance gap to the larger FMs. These results establish cross-magnification distillation as a viable approach for deploying FM capabilities in resource-constrained clinical environments, potentially enabling real-time pathology AI integration.
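To make the idea of dual-level knowledge transfer concrete, the following is a minimal, dependency-free sketch of one way a global term and a local token term could be combined. It is not the XMAG objective: the abstract does not specify the loss functions, so the use of cosine distance, the two weights, and the assumption that teacher tokens have already been spatially pooled to match the student's token grid are all illustrative choices, and every name below is hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def dual_level_loss(
    student_global: Vector,
    teacher_global: Vector,
    student_tokens: Sequence[Vector],
    teacher_tokens: Sequence[Vector],
    global_weight: float = 1.0,   # hypothetical weighting, not from the paper
    local_weight: float = 1.0,    # hypothetical weighting, not from the paper
) -> float:
    """Weighted sum of a global alignment term and a per-token local term.

    Assumption: teacher_tokens[i] describes the same tissue area as
    student_tokens[i], i.e. the 20x teacher tokens were pooled beforehand
    onto the 5x student's token grid.
    """
    if not student_tokens or len(student_tokens) != len(teacher_tokens):
        raise ValueError("token lists must be non-empty and of equal length")
    global_term = cosine_distance(student_global, teacher_global)
    local_term = sum(
        cosine_distance(s, t) for s, t in zip(student_tokens, teacher_tokens)
    ) / len(student_tokens)
    return global_weight * global_term + local_weight * local_term


# Toy example: the global embeddings disagree, the local tokens agree.
loss = dual_level_loss(
    student_global=[1.0, 0.0],
    teacher_global=[0.0, 1.0],
    student_tokens=[[1.0, 0.0], [0.0, 2.0]],
    teacher_tokens=[[2.0, 0.0], [0.0, 1.0]],
)
print(loss)
```

In the toy example only the global term contributes, because each student token points in the same direction as its teacher counterpart. In a real training loop the same structure would be expressed with differentiable tensor operations so that the student can be optimized by gradient descent.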