🤖 AI Summary
In computational pathology, real-world histopathological images are often degraded by multiple factors—including noise, blur, low resolution, and staining inconsistency—yet existing methods are typically task-specific and lack generalizability. To address this, we propose LPFM, the first unified low-level vision foundation model for pathology, integrating a contrastive pre-trained encoder with a text-guided conditional diffusion mechanism to enable feature disentanglement and task-adaptive inference. Its encoder is pre-trained on 190 million unlabeled images, and the full model is trained on 87,810 whole-slide images to jointly optimize diverse tasks such as image restoration and virtual staining. Evaluated across 66 benchmark tasks, it outperforms state-of-the-art methods in 56 (p < 0.01), achieving average PSNR gains of 10–15% on image restoration and SSIM improvements of 12–18% on virtual staining. LPFM demonstrates strong cross-task generalization and clinical applicability, establishing a scalable foundation for low-level histopathological image analysis.
📝 Abstract
Foundation models have revolutionized computational pathology by achieving remarkable success in high-level diagnostic tasks, yet the critical challenge of low-level image enhancement remains largely unaddressed. Real-world pathology images frequently suffer from degradations such as noise, blur, and low resolution due to slide preparation artifacts, staining variability, and imaging constraints, while the reliance on physical staining introduces significant costs, delays, and inconsistency. Although existing methods target individual problems like denoising or super-resolution, their task-specific designs lack the versatility to handle the diverse low-level vision challenges encountered in practice. To bridge this gap, we propose the first unified Low-level Pathology Foundation Model (LPFM), capable of enhancing image quality in restoration tasks, including super-resolution, deblurring, and denoising, as well as facilitating image translation tasks like virtual staining (H&E and special stains), all through a single adaptable architecture. Our approach introduces a contrastive pre-trained encoder that learns transferable, stain-invariant feature representations from 190 million unlabeled pathology images, enabling robust identification of degradation patterns. A unified conditional diffusion process dynamically adapts to specific tasks via textual prompts, ensuring precise control over output quality. Trained on a curated dataset of 87,810 whole-slide images (WSIs) across 34 tissue types and 5 staining protocols, LPFM demonstrates statistically significant improvements (p < 0.01) over state-of-the-art methods in most tasks (56/66), achieving Peak Signal-to-Noise Ratio (PSNR) gains of 10–15% for image restoration and Structural Similarity Index Measure (SSIM) improvements of 12–18% for virtual staining.
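The abstract describes a single diffusion sampler whose behavior is steered toward a specific task (e.g. super-resolution vs. virtual staining) by a textual prompt. The paper's actual model is not reproduced here; the following is a deliberately toy sketch of the *conditioning pattern* only, with every name (`embed_prompt`, `denoise_step`, `sample`) hypothetical and the text encoder and denoiser replaced by stand-in linear maps:

```python
import numpy as np

rng = np.random.default_rng(0)

def embed_prompt(prompt: str, dim: int = 16) -> np.ndarray:
    """Toy text embedding: a hash-seeded random vector standing in
    for a real text encoder (hypothetical, not the paper's)."""
    g = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return g.standard_normal(dim)

def denoise_step(x: np.ndarray, cond: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Toy 'denoiser': a linear map of (image, condition) that stands in
    for the conditional diffusion network."""
    return x - 0.1 * (W @ cond)[: x.size].reshape(x.shape) - 0.05 * x

def sample(prompt: str, shape=(4, 4), steps=10) -> np.ndarray:
    """Run the same sampler for any task; only the prompt embedding changes."""
    cond = embed_prompt(prompt)
    W = rng.standard_normal((shape[0] * shape[1], cond.size))
    x = rng.standard_normal(shape)  # start from pure noise
    for _ in range(steps):
        x = denoise_step(x, cond, W)
    return x

# Different prompts steer one shared sampler toward different outputs.
out_sr = sample("super-resolution")
out_vs = sample("virtual staining: H&E")
```

The point is architectural: one set of sampler weights serves all tasks, and the prompt embedding is the only task-specific input, which is what lets a single model cover the 66 benchmark tasks.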
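The reported gains are measured with PSNR and SSIM. For readers unfamiliar with these metrics, here is a minimal self-contained sketch of both formulas (PSNR exactly; SSIM in its simplified global form without the usual Gaussian sliding window, so values will differ slightly from windowed implementations such as scikit-image's):

```python
import numpy as np

def psnr(ref: np.ndarray, test: np.ndarray, max_val: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

def ssim_global(ref: np.ndarray, test: np.ndarray, max_val: float = 255.0) -> float:
    """Simplified global SSIM (single window over the whole image)."""
    x = ref.astype(np.float64)
    y = test.astype(np.float64)
    c1 = (0.01 * max_val) ** 2  # stabilizing constants from the SSIM definition
    c2 = (0.03 * max_val) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

# A flat 8x8 image and a copy with one perturbed pixel.
a = np.full((8, 8), 128, dtype=np.uint8)
b = a.copy()
b[0, 0] = 130
```

A relative PSNR gain of 10–15% (as reported for restoration) is a gain on this log-scale decibel value, so it corresponds to a substantially larger reduction in mean squared error.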