🤖 AI Summary
This work addresses the challenge of batch effects in histopathology images—arising from variations in staining protocols and scanning devices—that severely hinder model generalization across clinical sites. To this end, the authors propose Latent Manifold Compaction (LMC), an unsupervised representation learning framework that explicitly compacts stain-induced latent manifolds within a single source domain to construct a batch-invariant embedding space. Notably, LMC enables cross-batch image normalization without requiring any target-domain data, thereby eliminating the need for paired or multi-domain training samples. Evaluated on three public and internal benchmarks, the method significantly reduces inter-batch separation and consistently outperforms state-of-the-art approaches in both cross-batch classification and detection tasks, substantially improving model generalization.
📝 Abstract
Batch effects arising from technical variations in histopathology staining protocols, scanners, and acquisition pipelines pose a persistent challenge for computational pathology, hindering cross-batch generalization and limiting reliable deployment of models across clinical sites. In this work, we introduce Latent Manifold Compaction (LMC), an unsupervised representation learning framework that performs image harmonization by learning batch-invariant embeddings from a single source dataset through explicit compaction of stain-induced latent manifolds. This allows LMC to generalize to target-domain data unseen during training. Evaluated on three challenging public and in-house benchmarks, LMC substantially reduces batch-induced separation across multiple datasets and consistently outperforms state-of-the-art normalization methods in downstream cross-batch classification and detection tasks, enabling superior generalization.
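The abstract does not spell out the compaction objective, but the idea of "explicit compaction of stain-induced latent manifolds" can be illustrated with a minimal sketch: given several stain-perturbed views of each image, pull their embeddings toward a per-image center (compacting the stain-induced manifold), while keeping the centers of different images spread out so the embedding space does not collapse. The function below is a hypothetical illustration of this principle, not the paper's actual loss; the array layout and the variance-based spread term are assumptions.

```python
import numpy as np

def compaction_loss(z_views):
    """Illustrative (not the paper's) compaction objective.

    z_views: array of shape (n_images, n_views, dim) holding embeddings
    of several stain-augmented views per source image -- an assumed setup.
    Minimizing this pulls each image's views toward their mean embedding
    (batch/stain invariance) while the subtracted spread term rewards
    keeping different images' centers apart (avoiding collapse).
    """
    centers = z_views.mean(axis=1, keepdims=True)              # (n, 1, d)
    # Intra-manifold variance: squared distance of each view to its center.
    compact = np.mean(np.sum((z_views - centers) ** 2, axis=-1))
    # Inter-image spread: per-dimension variance of the image centers.
    spread = np.mean(np.var(centers.squeeze(1), axis=0))
    return compact - spread

# Two images whose augmented views already coincide: compaction term is 0,
# so only the (negated) spread of the two centers remains.
z = np.array([[[0.0, 0.0], [0.0, 0.0]],
              [[1.0, 1.0], [1.0, 1.0]]])
print(compaction_loss(z))
```

In a real training loop the stain-perturbed views would come from color augmentations of the same patch, and the encoder would be optimized end-to-end against a loss of this general shape.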