🤖 AI Summary
Pathology AI development is hindered by the scarcity of high-quality annotated data and by the semantic instability and morphological hallucinations of existing generative models. To address this, the authors propose CRAFTS (Correlation-Regulated Alignment Framework for Tissue Synthesis), the first pathology-specific text-to-image foundation model, built around a novel correlation-regulated alignment mechanism. Trained in two stages on approximately 2.8 million pathology image-caption pairs, CRAFTS mitigates semantic drift through a joint semantic alignment loss, multimodal feature disentanglement, and biologically grounded constraints. It generates high-fidelity images across 30 cancer types and, when coupled with ControlNet, supports precise control over tissue architecture from inputs such as nuclear segmentation masks or fluorescence images. Image quality is validated by objective metrics and pathologist evaluation. CRAFTS-augmented datasets improve downstream performance in classification, cross-modal retrieval, self-supervised learning, and visual question answering, which helps alleviate bottlenecks caused by data scarcity, privacy constraints, and rare phenotype modeling.
📝 Abstract
The development of clinical-grade artificial intelligence in pathology is limited by the scarcity of diverse, high-quality annotated datasets. Generative models offer a potential solution but suffer from semantic instability and morphological hallucinations that compromise diagnostic reliability. To address this challenge, we introduce a Correlation-Regulated Alignment Framework for Tissue Synthesis (CRAFTS), the first generative foundation model for pathology-specific text-to-image synthesis. Trained with a dual-stage strategy on approximately 2.8 million image-caption pairs, CRAFTS incorporates a novel alignment mechanism that suppresses semantic drift to ensure biological accuracy. The model generates diverse pathological images spanning 30 cancer types, with quality rigorously validated by objective metrics and pathologist evaluations. Furthermore, CRAFTS-augmented datasets improve performance across various clinical tasks, including classification, cross-modal retrieval, self-supervised learning, and visual question answering. In addition, coupling CRAFTS with ControlNet enables precise control over tissue architecture from inputs such as nuclear segmentation masks and fluorescence images. By overcoming the critical barriers of data scarcity and privacy concerns, CRAFTS provides a limitless source of diverse, annotated histology data, supporting the creation of robust diagnostic tools for rare and complex cancer phenotypes.