Position: From Correlation to Causation: Max-Pooling-Based Multi-Instance Learning Leads to More Robust Whole Slide Image Classification

📅 2024-08-18

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Problem: Attention-based multiple instance learning (MIL) for whole-slide image (WSI) classification tends to focus on irrelevant patterns such as staining conditions and tissue morphology, which leads to incorrect patch-level predictions and unreliable interpretability. Method: The paper analyzes why attention-based methods rely on spurious correlations, examines why existing max-pooling-based methods underperform, and proposes FocusMIL, a simple max-pooling-based MIL method intended to base its predictions on causal factors. Results: On two datasets, FocusMIL outperforms existing mainstream attention-based MIL methods. As a position paper, it advocates renewed attention to max-pooling-based methods for more robust and interpretable predictions.

📝 Abstract

Although attention-based multi-instance learning (MIL) algorithms have achieved impressive performance on slide-level whole slide image (WSI) classification tasks, they are prone to mistakenly focusing on irrelevant patterns such as staining conditions and tissue morphology, leading to incorrect patch-level predictions and unreliable interpretability. In this paper, we analyze why attention-based methods tend to rely on spurious correlations in their predictions. Furthermore, we revisit max-pooling-based approaches and examine the reasons behind the underperformance of existing methods. We argue that well-trained max-pooling-based MIL models can make predictions based on causal factors and avoid relying on spurious correlations. Building on these insights, we propose a simple yet effective max-pooling-based MIL method (FocusMIL) that outperforms existing mainstream attention-based methods on two datasets. In this position paper, we advocate renewed attention to max-pooling-based methods to achieve more robust and interpretable predictions.
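
The contrast between the two pooling strategies can be illustrated with a minimal sketch. This is not the FocusMIL implementation; it is a simplified toy that pools per-patch scores directly, whereas practical attention-based MIL models usually pool patch features before classification. The function names, the patch logits, and the attention logits are all hypothetical values chosen for illustration.

```python
import math


def max_pool_bag_score(instance_logits):
    """Max-pooling MIL: the bag logit is the largest instance logit.

    Returns the bag logit and the index of the patch that produced it,
    so the slide-level decision traces back to exactly one patch.
    """
    top = max(range(len(instance_logits)), key=lambda i: instance_logits[i])
    return instance_logits[top], top


def attention_pool_bag_score(instance_logits, attention_logits):
    """Attention pooling (simplified): softmax-weighted mean of instance logits.

    Returns the bag logit and the attention weights. Every patch with a
    nonzero weight contributes, including patches that are only
    spuriously related to the label.
    """
    m = max(attention_logits)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in attention_logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    score = sum(w * s for w, s in zip(weights, instance_logits))
    return score, weights


# Hypothetical bag of four patches; only patch 2 looks positive.
patch_logits = [-2.0, -1.5, 3.0, -1.0]
# Hypothetical attention logits that favor patch 1 (e.g. a staining pattern).
attn_logits = [0.1, 2.0, 0.3, 0.2]

bag_max, top_idx = max_pool_bag_score(patch_logits)
bag_attn, weights = attention_pool_bag_score(patch_logits, attn_logits)

print("max-pooling bag logit:", bag_max, "from patch", top_idx)
print("attention bag logit:", round(bag_attn, 3))
```

In this toy setting the max-pooled score depends only on the most positive patch, while the attention-pooled score is pulled toward whichever patches receive the most weight, which is one way to picture the abstract's concern about attention drifting to irrelevant patterns.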

Problem

Research questions and friction points this paper is trying to address.

Attention-based MIL focuses on irrelevant patterns (staining conditions, tissue morphology), causing incorrect patch-level predictions

Existing max-pooling MIL underperforms despite causal prediction potential

How to build a max-pooling-based method (FocusMIL) for robust WSI classification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Max-pooling-based MIL avoids spurious correlations

FocusMIL outperforms attention-based methods

Robust WSI classification via causal factors

🔎 Similar Papers

Rethinking Pre-trained Feature Extractor Selection in Multiple Instance Learning for Whole Slide Image Classification