🤖 AI Summary
Conventional multiple instance learning (MIL) for whole-slide image (WSI) analysis is permutation-invariant and therefore neglects spatial and semantic relationships among instances. Method: We propose a novel “instance jigsaw restoration” paradigm that treats a WSI as a spatial arrangement to be reconstructed: the model learns to restore the original order of randomly shuffled instances, which requires it to capture the structural dependencies between them. The approach uses a Siamese network guided by optimal transport and adds contrastive learning to refine instance-level similarity metrics, addressing a fundamental limitation of standard MIL. Contribution/Results: This is the first work to introduce jigsaw-based self-supervision into WSI analysis. On classification and survival prediction tasks, the framework consistently outperforms state-of-the-art MIL methods, demonstrating the effectiveness of, and the need for, explicitly modeling spatial dependencies in pathological representation learning.
📝 Abstract
While multiple instance learning (MIL) has been shown to be a promising approach for histopathological whole slide image (WSI) analysis, its reliance on permutation invariance significantly limits its capacity to uncover semantic correlations between instances within WSIs. Based on our empirical and theoretical investigations, we argue that approaches that are not permutation-invariant, but that better capture spatial correlations between instances, can offer more effective solutions. In light of these findings, we propose a novel alternative to existing MIL for WSI analysis: learning to restore the order of instances from a randomly shuffled arrangement. We term this task cracking an instance jigsaw puzzle, a process that uncovers semantic correlations between instances. To tackle instance jigsaw puzzles, we propose a novel Siamese network solution, which is theoretically justified by optimal transport theory. We validate the proposed method on WSI classification and survival prediction tasks, where it outperforms recent state-of-the-art MIL competitors. The code is available at https://github.com/xiwenc1/MIL-JigsawPuzzles.
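The two ideas in the abstract, permutation invariance and order restoration via optimal transport, can be illustrated with a small toy sketch. It is not the authors' implementation: the Siamese network, the training objective, and any contrastive component are not reproduced, and all names (`mean_pool`, `sinkhorn`, `run_toy_puzzle`) and parameter values are hypothetical. The toy also gives the solver access to the correctly ordered reference features, which a real model would not have; it only shows that mean pooling cannot see a shuffle, while an entropic optimal transport plan between shuffled and ordered instances behaves like a soft permutation from which the order can be read off.

```python
import math
import random


def mean_pool(bag):
    """Permutation-invariant pooling: average each feature over the bag."""
    n = len(bag)
    return [sum(inst[k] for inst in bag) / n for k in range(len(bag[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sinkhorn(cost, eps=0.5, n_iters=100):
    """Entropic OT plan between two uniform distributions over n items.

    Alternately rescales rows and columns of exp(-cost / eps) so that
    each row and each column of the plan sums to (approximately) 1 / n.
    """
    n = len(cost)
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * n
    for _ in range(n_iters):
        u = [(1.0 / n) / sum(kernel[i][j] * v[j] for j in range(n))
             for i in range(n)]
        v = [(1.0 / n) / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(n)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(n)] for i in range(n)]


def run_toy_puzzle(n=6, dim=8, noise=0.01, seed=0):
    """Shuffle a toy bag of instances and recover the order with OT."""
    rng = random.Random(seed)
    # Hypothetical instance features in their original (spatial) order.
    ordered = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    perm = list(range(n))
    rng.shuffle(perm)
    # shuffled[i] is a slightly noisy copy of ordered[perm[i]].
    shuffled = [[x + rng.gauss(0, noise) for x in ordered[p]] for p in perm]
    # cost[i][j]: how badly shuffled instance i fits original position j.
    cost = [[sq_dist(s, o) for o in ordered] for s in shuffled]
    plan = sinkhorn(cost)
    # Harden the soft permutation: most likely position for each instance.
    recovered = [max(range(n), key=lambda j: row[j]) for row in plan]
    return ordered, shuffled, perm, recovered
```

Calling `run_toy_puzzle(seed=0)` returns the ordered bag, the shuffled bag, the true permutation, and the permutation recovered from the transport plan. Applying `mean_pool` to a bag and to any reordering of it gives the same vector up to floating-point rounding, which is why a pooling-based aggregator carries no information about instance order.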