AHDMIL: Asymmetric Hierarchical Distillation Multi-Instance Learning for Fast and Accurate Whole-Slide Image Classification

📅 2025-08-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high inference cost in whole-slide image (WSI) classification—stemming from processing thousands of high-resolution patches—this paper proposes AHDMIL, a two-stage framework. Stage one employs a lightweight dual-branch pre-screening network for joint low- and high-resolution analysis and coarse filtering. Stage two introduces dynamic multiple-instance learning coupled with asymmetric hierarchical distillation and a learnable Kolmogorov–Arnold classifier based on Chebyshev polynomials to enhance fine-grained discriminability. By integrating self-distillation, asymmetric distillation, and learnable activation functions, AHDMIL significantly reduces computational overhead. Evaluated on four public benchmarks, AHDMIL achieves 1.2–2.1× faster inference speed and delivers a 5.3% relative accuracy improvement on Camelyon16, while consistently outperforming state-of-the-art methods in AUC, F1-score, and other key metrics.

📝 Abstract
Although multi-instance learning (MIL) has succeeded in pathological image classification, it faces the challenge of high inference costs due to the need to process thousands of patches from each gigapixel whole slide image (WSI). To address this, we propose AHDMIL, an Asymmetric Hierarchical Distillation Multi-Instance Learning framework that enables fast and accurate classification by eliminating irrelevant patches through a two-step training process. AHDMIL comprises two key components: the Dynamic Multi-Instance Network (DMIN), which operates on high-resolution WSIs, and the Dual-Branch Lightweight Instance Pre-screening Network (DB-LIPN), which analyzes corresponding low-resolution counterparts. In the first step, self-distillation (SD), DMIN is trained for WSI classification while generating per-instance attention scores to identify irrelevant patches. These scores guide the second step, asymmetric distillation (AD), where DB-LIPN learns to predict the relevance of each low-resolution patch. The relevant patches predicted by DB-LIPN have spatial correspondence with patches in high-resolution WSIs, which are used for fine-tuning and efficient inference of DMIN. In addition, we design the first Chebyshev-polynomial-based Kolmogorov-Arnold (CKA) classifier in computational pathology, which improves classification performance through learnable activation layers. Extensive experiments on four public datasets demonstrate that AHDMIL consistently outperforms previous state-of-the-art methods in both classification performance and inference speed. For example, on the Camelyon16 dataset, it achieves a relative improvement of 5.3% in accuracy and accelerates inference by 1.2×. Across all datasets, area under the curve (AUC), accuracy, F1 score, and Brier score show consistent gains, with average inference speedups ranging from 1.2× to 2.1×. The code is available.
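The CKA classifier replaces fixed activations with learnable Chebyshev expansions in the Kolmogorov-Arnold style. The paper's exact architecture is not given here, so the following is a minimal, hypothetical sketch of one such layer: each input feature is squashed into [-1, 1] with tanh, expanded in a Chebyshev basis of the first kind, and mixed by learnable coefficients (initialized to zero in this toy version; names like `ChebyshevKANLayer` are illustrative, not from the paper).

```python
import math

def chebyshev_basis(x, degree):
    """Chebyshev polynomials of the first kind T_0..T_degree at x in [-1, 1],
    computed via the recurrence T_k(x) = 2x*T_{k-1}(x) - T_{k-2}(x)."""
    T = [1.0, x]
    for _ in range(2, degree + 1):
        T.append(2.0 * x * T[-1] - T[-2])
    return T[: degree + 1]

class ChebyshevKANLayer:
    """Toy Kolmogorov-Arnold layer (illustrative, not the paper's code):
    output_j = sum_i sum_k coeff[j][i][k] * T_k(tanh(x_i)).
    Coefficients are the learnable parameters; training is omitted."""

    def __init__(self, in_dim, out_dim, degree=3):
        self.in_dim, self.out_dim, self.degree = in_dim, out_dim, degree
        self.coeff = [[[0.0] * (degree + 1) for _ in range(in_dim)]
                      for _ in range(out_dim)]

    def forward(self, x):
        # One Chebyshev feature expansion per input dimension.
        bases = [chebyshev_basis(math.tanh(v), self.degree) for v in x]
        out = []
        for j in range(self.out_dim):
            s = 0.0
            for i in range(self.in_dim):
                for k in range(self.degree + 1):
                    s += self.coeff[j][i][k] * bases[i][k]
            out.append(s)
        return out
```

Because each per-feature activation is a polynomial with trainable coefficients, the layer can learn its own nonlinearity rather than relying on a fixed ReLU or GELU, which is the property the abstract attributes to the CKA classifier.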
Problem

Research questions and friction points this paper is trying to address.

Reducing high inference costs in whole-slide image classification
Identifying irrelevant patches for efficient processing
Improving classification accuracy and speed simultaneously
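The core efficiency idea above is attention-guided pre-screening: score every patch cheaply, then run the expensive high-resolution model only on the patches deemed relevant. A minimal stand-in for that filtering step, assuming per-patch attention scores are already available (the function name and top-k rule are illustrative; the paper's actual selection criterion may differ):

```python
def prescreen_patches(attention_scores, keep_ratio=0.2):
    """Return the indices of the top `keep_ratio` fraction of patches,
    ranked by attention score. Discarded patches are never sent to the
    high-resolution classifier, which is where the inference savings
    come from in attention-guided pre-screening."""
    n_keep = max(1, int(len(attention_scores) * keep_ratio))
    ranked = sorted(range(len(attention_scores)),
                    key=lambda i: attention_scores[i], reverse=True)
    return sorted(ranked[:n_keep])
```

With `keep_ratio=0.2`, only a fifth of the thousands of patches in a gigapixel WSI reach the downstream model, which matches the abstract's motivation for filtering irrelevant patches.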
Innovation

Methods, ideas, or system contributions that make the work stand out.

Asymmetric Hierarchical Distillation for fast WSI classification
Dual-Branch Lightweight Network for patch pre-screening
Chebyshev-polynomial-based CKA classifier for pathology