🤖 AI Summary
This study addresses the challenges that small-sample, high-dimensional, and incomplete multimodal biomarker tabular data pose for Alzheimer’s disease (AD) diagnosis, data that existing deep learning approaches struggle to model effectively. To this end, the authors propose TAP-GPT—the first domain-adapted tabular large language model tailored to AD prediction—built upon the TableGPT2 architecture. TAP-GPT leverages tabular prompting for few-shot fine-tuning and integrates feature selection, robust handling of missing data, and a self-reflection mechanism. It uniquely enables structured reasoning and modality-aware interpretability while supporting multi-agent clinical decision integration. Evaluated on four ADNI-derived datasets, TAP-GPT significantly outperforms conventional machine learning methods and its backbone foundation models under few-shot and missing-data conditions, achieving performance comparable to general-purpose large language models.
📝 Abstract
Accurate diagnosis of Alzheimer's disease (AD) requires handling tabular biomarker data, yet such data are often small and incomplete, settings in which deep learning models frequently fail to outperform classical methods. Pretrained large language models (LLMs) offer few-shot generalization, structured reasoning, and interpretable outputs, providing a powerful paradigm shift for clinical prediction. We propose TAP-GPT (Tabular Alzheimer's Prediction GPT), a domain-adapted tabular LLM framework built on TableGPT2 and fine-tuned for few-shot AD classification using tabular prompts rather than plain text. We evaluate TAP-GPT across four ADNI-derived datasets, including QT-PAD biomarkers and region-level structural MRI, amyloid PET, and tau PET, for binary AD classification. Across multimodal and unimodal settings, TAP-GPT improves upon its backbone models and outperforms traditional machine learning baselines in the few-shot setting while remaining competitive with state-of-the-art general-purpose LLMs. We show that feature selection mitigates degradation on high-dimensional inputs and that TAP-GPT maintains stable performance under simulated and real-world missingness without imputation. Additionally, TAP-GPT produces structured, modality-aware reasoning aligned with established AD biology and shows greater stability under self-reflection, supporting its use in iterative multi-agent systems. To our knowledge, this is the first systematic application of a tabular-specialized LLM to multimodal biomarker-based AD prediction, demonstrating that such pretrained models can effectively address structured clinical prediction tasks and laying the foundation for tabular LLM-driven multi-agent clinical decision-support systems. The source code is publicly available on GitHub: https://github.com/sophie-kearney/TAP-GPT.
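To make the "tabular prompts rather than plain text" idea concrete, here is a minimal sketch of how a few-shot tabular prompt might be assembled from biomarker records. The feature names, values, and serialization format below are illustrative assumptions, not the paper's actual prompt template or ADNI data; note that missing values are passed through unimputed, mirroring the no-imputation evaluation described above.

```python
# Hypothetical sketch of tabular prompting for few-shot AD classification.
# Feature names/values are illustrative, NOT the paper's template or real ADNI data.

def row_to_markdown(features: dict) -> str:
    """Serialize one biomarker record as a Markdown table (header, separator, values)."""
    header = "| " + " | ".join(features) + " |"
    sep = "| " + " | ".join("---" for _ in features) + " |"
    values = "| " + " | ".join(
        "missing" if v is None else str(v) for v in features.values()  # no imputation
    ) + " |"
    return "\n".join([header, sep, values])

def build_prompt(support_set: list, query: dict) -> str:
    """Assemble a few-shot prompt: labeled support rows followed by the query row."""
    parts = ["Classify each subject as AD or CN (cognitively normal)."]
    for feats, label in support_set:
        parts.append(row_to_markdown(feats) + f"\nLabel: {label}")
    parts.append(row_to_markdown(query) + "\nLabel:")  # model completes this label
    return "\n\n".join(parts)

support = [
    ({"age": 74, "hippocampus_volume": 5.1, "abeta": 980}, "CN"),
    ({"age": 78, "hippocampus_volume": 3.9, "abeta": None}, "AD"),  # missing biomarker
]
query = {"age": 76, "hippocampus_volume": 4.2, "abeta": 610}
print(build_prompt(support, query))
```

A structured serialization like this lets a table-pretrained backbone such as TableGPT2 see column alignment explicitly, rather than parsing free-form sentences.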