Unleashing the Power of Image-Tabular Self-Supervised Learning via Breaking Cross-Tabular Barriers

📅 2025-12-15
🤖 AI Summary
Existing self-supervised multimodal methods suffer from rigid tabular modeling and strong dependence on specific cohorts, hindering the learning of transferable, cross-center medical knowledge. To address this, we propose CITab, a cross-cohort self-supervised framework featuring two key innovations: (1) semantic-aware tabular encoding—integrating column-header semantics into table representation—and (2) prototype-guided mixture-of-linear-layers (P-MoLin), enabling decoupled joint representation learning of tabular and imaging data. CITab employs joint pretraining via contrastive learning and masked tabular reconstruction, facilitating both cross-cohort knowledge transfer and disentangled learning of medical concepts. Evaluated on Alzheimer’s disease diagnosis across three public cohorts (4,461 subjects), CITab significantly outperforms state-of-the-art methods, demonstrating superior generalizability and clinical applicability.

📝 Abstract
Multi-modal learning integrating medical images and tabular data has significantly advanced clinical decision-making in recent years. Self-Supervised Learning (SSL) has emerged as a powerful paradigm for pretraining these models on large-scale unlabeled image-tabular data, aiming to learn discriminative representations. However, existing SSL methods for image-tabular representation learning are often confined to specific data cohorts, mainly due to their rigid mechanisms for modeling heterogeneous tabular data. This cross-tabular barrier hinders multi-modal SSL methods from effectively learning transferable medical knowledge shared across diverse cohorts. In this paper, we propose a novel SSL framework, namely CITab, designed to learn powerful multi-modal feature representations in a cross-tabular manner. We design the tabular modeling mechanism from a semantic-awareness perspective by integrating column headers as semantic cues, which facilitates transferable knowledge learning and scalability in utilizing multiple data sources for pretraining. Additionally, we propose a prototype-guided mixture-of-linear-layers (P-MoLin) module for tabular feature specialization, empowering the model to effectively handle the heterogeneity of tabular data and explore the underlying medical concepts. We conduct comprehensive evaluations on the Alzheimer's disease diagnosis task across three publicly available data cohorts containing 4,461 subjects. Experimental results demonstrate that CITab outperforms state-of-the-art approaches, paving the way for effective and scalable cross-tabular multi-modal learning.
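The semantic-aware tabular encoding idea can be illustrated with a minimal sketch: identify each cell by its column-header semantics rather than its column position, so tables with different schemas land in one shared embedding space. The paper presumably uses a learned text encoder over header tokens; here a deterministic hash-seeded pseudo-embedding stands in for it, and all function names and column names are illustrative, not the authors'.

```python
import hashlib
import numpy as np

DIM = 16  # illustrative embedding size


def header_embedding(header: str, dim: int = DIM) -> np.ndarray:
    """Deterministic pseudo-embedding of a column header.

    Stands in for a text encoder over header tokens: the RNG is seeded from a
    hash of the header string, so the same header maps to the same unit vector
    no matter which cohort's table it appears in.
    """
    seed = int.from_bytes(hashlib.sha256(header.encode()).digest()[:4], "big")
    vec = np.random.default_rng(seed).standard_normal(dim)
    return vec / np.linalg.norm(vec)


def encode_row(row: dict, dim: int = DIM) -> np.ndarray:
    """Encode one tabular record as the mean of header-scaled cell embeddings.

    Each cell contributes its header embedding scaled by the (pre-normalized)
    numeric value. Because cells are keyed by header semantics, the encoding is
    invariant to column order and tolerant of missing columns across cohorts.
    """
    cells = [header_embedding(h, dim) * float(v) for h, v in row.items()]
    return np.mean(cells, axis=0)


# Two cohorts with different column orders still map consistently.
row_a = {"age": 0.72, "mmse_score": 0.9}
row_b = {"mmse_score": 0.9, "age": 0.72}  # same content, reordered
```

With `encode_row(row_a)` equal to `encode_row(row_b)`, a downstream contrastive or masked-reconstruction objective can be pretrained across cohorts whose tables only partially overlap.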
Problem

Research questions and friction points this paper is trying to address.

Breaks cross-tabular barriers in self-supervised learning for medical data.
Enhances transferable knowledge across diverse cohorts using semantic-aware tabular modeling.
Handles tabular data heterogeneity via prototype-guided feature specialization.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates column headers as semantic cues for tabular data modeling
Uses prototype-guided mixture-of-linear layers for tabular feature specialization
Enables cross-tabular self-supervised learning across diverse medical cohorts
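The prototype-guided mixture-of-linear-layers can be sketched as a routing scheme: a bank of linear experts, each paired with a learnable prototype vector, with an input soft-assigned to experts by its similarity to the prototypes so that different underlying medical concepts specialize different experts. This is a minimal numpy sketch under that reading; shapes, names, and the cosine-similarity gate are assumptions, not the paper's exact design.

```python
import numpy as np


class PMoLin:
    """Minimal sketch of a prototype-guided mixture-of-linear-layers.

    Each expert is a plain linear map W_e x + b_e; the gate is a softmax over
    the cosine similarity between the input and each expert's prototype, so
    inputs near the same prototype are transformed by the same specialist.
    """

    def __init__(self, dim_in: int, dim_out: int, n_experts: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.prototypes = rng.standard_normal((n_experts, dim_in))
        self.weights = rng.standard_normal((n_experts, dim_in, dim_out)) * 0.1
        self.bias = np.zeros((n_experts, dim_out))

    def __call__(self, x: np.ndarray):
        # Gate: softmax over cosine similarity to each prototype.
        protos = self.prototypes / np.linalg.norm(self.prototypes, axis=1, keepdims=True)
        x_unit = x / (np.linalg.norm(x) + 1e-8)
        logits = protos @ x_unit
        gates = np.exp(logits - logits.max())
        gates /= gates.sum()
        # Mixture: gate-weighted sum of every expert's linear output.
        out = np.einsum("e,eio,i->o", gates, self.weights, x) + gates @ self.bias
        return out, gates


module = PMoLin(dim_in=4, dim_out=3, n_experts=5)
y, g = module(np.ones(4))
```

In a trained model the prototypes and expert weights would be learned end-to-end with the SSL objectives; here they are random, which is enough to show the routing mechanics.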
Authors

Yibing Fu — Department of Biomedical Engineering, National University of Singapore
Yunpeng Zhao — Department of Biomedical Engineering, National University of Singapore
Zhitao Zeng — National University of Singapore (Vision-Language Models)
Cheng Chen — Department of Electrical and Electronic Engineering; School of Biomedical Engineering, The University of Hong Kong
Yueming Jin — Assistant Professor, National University of Singapore (Medical Image Analysis, Surgical AI & Robotics, Multimodal Learning)