🤖 AI Summary
To address semantic bias induced by staining and scanning variations, as well as noise from causally irrelevant topological subgraphs, in survival analysis on hematoxylin-and-eosin (H&E)-stained whole-slide images (WSIs), this paper proposes C$^2$MIL, a dual causal graph-based multiple instance learning (MIL) model that treats causal relationships at both the semantic and topological levels. C$^2$MIL is grounded in a dual structural causal model and has two components: (i) a cross-scale adaptive feature disentangling module that performs semantic causal intervention; and (ii) a Bernoulli differentiable causal subgraph sampling method that performs topological causal discovery. A joint optimization strategy combining disentangling supervision and contrastive learning refines both kinds of causality simultaneously. Experiments show consistent improvements in the generalization and interpretability of survival prediction over existing methods. C$^2$MIL can also serve as a causal enhancement for diverse MIL baselines. The code is publicly available.
📝 Abstract
Graph-based Multiple Instance Learning (MIL) is widely used in survival analysis with Hematoxylin and Eosin (H&E)-stained whole slide images (WSIs) due to its ability to capture topological information. However, variations in staining and scanning can introduce semantic bias, while causally irrelevant topological subgraphs can introduce noise, resulting in biased slide-level representations. These issues can hinder both the interpretability and the generalization of the analysis. To address them, we introduce a dual structural causal model as the theoretical foundation and propose a novel and interpretable dual causal graph-based MIL model, C$^2$MIL. C$^2$MIL incorporates a novel cross-scale adaptive feature disentangling module for semantic causal intervention and a new Bernoulli differentiable causal subgraph sampling method for topological causal discovery. A joint optimization strategy combining disentangling supervision and contrastive learning enables simultaneous refinement of both semantic and topological causalities. Experiments demonstrate that C$^2$MIL consistently improves generalization and interpretability over existing methods and can serve as a causal enhancement for diverse MIL baselines. The code is available at https://github.com/mimic0127/C2MIL.
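The abstract does not say how the Bernoulli subgraph sampling is made differentiable. One common way to do this is the binary Concrete (relaxed Bernoulli) trick, sketched below in plain Python. This is an illustrative assumption, not the authors' implementation (see the repository for that). The names `relaxed_bernoulli`, `sample_edge_mask`, `edge_logits`, and `temperature` are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relaxed_bernoulli(logit, temperature, rng):
    """Draw one binary Concrete sample.

    Returns the soft keep-probability in [0, 1] and its derivative with
    respect to `logit`, which is what lets gradients reach the edge scorer.
    """
    # Clip the uniform draw so the logistic noise stays finite.
    u = min(max(rng.random(), 1e-9), 1.0 - 1e-9)
    noise = math.log(u) - math.log(1.0 - u)  # Logistic(0, 1) noise
    soft = sigmoid((logit + noise) / temperature)
    grad = soft * (1.0 - soft) / temperature  # d(soft) / d(logit)
    return soft, grad


def sample_edge_mask(edge_logits, temperature=0.5, seed=0):
    """Sample a soft and a hard keep/drop mask over graph edges.

    `edge_logits` are hypothetical per-edge scores; a higher score means the
    edge is more likely to be kept in the sampled subgraph.
    """
    rng = random.Random(seed)
    soft_mask, grads = [], []
    for logit in edge_logits:
        soft, grad = relaxed_bernoulli(logit, temperature, rng)
        soft_mask.append(soft)
        grads.append(grad)
    # Hard mask for the forward pass; a straight-through estimator would
    # reuse the soft-mask gradients in the backward pass.
    hard_mask = [1 if s > 0.5 else 0 for s in soft_mask]
    return soft_mask, hard_mask, grads
```

In this sketch, lowering `temperature` pushes the soft mask toward 0 or 1, so it behaves more like a true Bernoulli draw, but the gradients become more variable. In an autodiff framework the same idea would usually be written with the framework's own relaxed Bernoulli distribution rather than hand-derived gradients.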