🤖 AI Summary
This study addresses key challenges in spatial proteomics (poor cross-study generalizability, marker combinations that vary between experiments, and limited task adaptability) by introducing VirTues (Virtual Tissues), a general-purpose foundation model for biological tissues. VirTues is a Transformer-based model that uses a tokenization scheme encoding both the spatial and marker dimensions, together with attention mechanisms that scale to high-dimensional multiplexed data while remaining interpretable. It is trained on diverse cancer and non-cancer tissue datasets. The model generalizes without task-specific fine-tuning, which supports cross-study analysis and the integration of novel markers. The authors report that VirTues outperforms existing approaches across clinical diagnostics, biological discovery, and patient case retrieval, while providing insights into tissue function and disease mechanisms.
📝 Abstract
Spatial proteomics technologies have transformed our understanding of complex tissue architectures by enabling simultaneous analysis of multiple molecular markers and their spatial organization. The high dimensionality of these data, varying marker combinations across experiments and heterogeneous study designs pose unique challenges for computational analysis. Here, we present Virtual Tissues (VirTues), a foundation model framework for biological tissues that operates across the molecular, cellular and tissue scales. VirTues introduces innovations in transformer architecture design, including a novel tokenization scheme that captures both spatial and marker dimensions, and attention mechanisms that scale to high-dimensional multiplex data while maintaining interpretability. Trained on diverse cancer and non-cancer tissue datasets, VirTues demonstrates strong generalization capabilities without task-specific fine-tuning, enabling cross-study analysis and novel marker integration. As a generalist model, VirTues outperforms existing approaches across clinical diagnostics, biological discovery and patient case retrieval tasks, while providing insights into tissue function and disease mechanisms.
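The abstract does not spell out the tokenization scheme, so the following is only a minimal sketch of one way a tokenizer could cover both the spatial and marker dimensions: each marker channel of a multiplexed image is cut into square patches, and every token keeps its marker index and patch position. The function name `tokenize_multiplex`, the `patch_size` parameter, and the `(markers, height, width)` array layout are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def tokenize_multiplex(image, patch_size):
    """Return one flattened token per (marker, patch) pair.

    Hypothetical helper, not taken from VirTues. `image` is assumed to have
    shape (markers, height, width), with height and width divisible by
    `patch_size`.
    """
    n_markers, height, width = image.shape
    if height % patch_size or width % patch_size:
        raise ValueError("height and width must be multiples of patch_size")
    rows, cols = height // patch_size, width // patch_size

    # Split each spatial axis into (patch index, offset within patch).
    patches = image.reshape(n_markers, rows, patch_size, cols, patch_size)
    # Reorder to (marker, patch row, patch col, offset y, offset x).
    patches = patches.transpose(0, 1, 3, 2, 4)
    tokens = patches.reshape(n_markers * rows * cols, patch_size * patch_size)

    # Index arrays recording which marker and which patch each token came from.
    marker_ids, row_ids, col_ids = np.meshgrid(
        np.arange(n_markers), np.arange(rows), np.arange(cols), indexing="ij"
    )
    return tokens, marker_ids.ravel(), row_ids.ravel(), col_ids.ravel()


# Two panels with different marker counts give different numbers of tokens
# but the same token width, so one model could in principle consume both.
panel_a = np.random.rand(5, 8, 8)
panel_b = np.random.rand(3, 8, 8)
tokens_a, *_ = tokenize_multiplex(panel_a, patch_size=4)
tokens_b, *_ = tokenize_multiplex(panel_b, patch_size=4)
print(tokens_a.shape, tokens_b.shape)
```

Because the marker index travels with each token rather than being fixed by a channel position, a layout like this would let datasets with different marker panels share one token format, which is the kind of property the abstract points to when it mentions varying marker combinations across experiments.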