🤖 AI Summary
Early diagnosis of Alzheimer’s disease (AD) is hindered by the scarcity of labeled tabular biomarker data. Method: We propose TAP-GPT, the first framework to systematically apply large language models (LLMs) to few-shot classification of structured medical tables. Built upon TableGPT2, it combines in-context learning with parameter-efficient qLoRA fine-tuning to construct a few-shot prompting mechanism tailored to clinical tabular data. Contribution/Results: On binary classification of AD versus cognitively normal (CN) subjects, TAP-GPT achieves significantly higher accuracy and robustness than both general-purpose LLMs (e.g., Llama-3) and specialized tabular models (e.g., TabPFN), even with extremely limited labeled samples. This work demonstrates the feasibility of adapting LLMs to structured biomedical data and establishes a novel paradigm for few-shot medical diagnosis.
📝 Abstract
Early and accurate diagnosis of Alzheimer's disease (AD), a complex neurodegenerative disorder, requires analysis of heterogeneous biomarkers (e.g., neuroimaging, genetic risk factors, cognitive tests, and cerebrospinal fluid proteins) that are typically represented in tabular format. With flexible few-shot reasoning, multimodal integration, and natural-language-based interpretability, large language models (LLMs) offer unprecedented opportunities for prediction with structured biomedical data. We propose a novel framework called TAP-GPT (Tabular Alzheimer's Prediction GPT) that adapts TableGPT2, a multimodal tabular-specialized LLM originally developed for business intelligence tasks, to AD diagnosis using structured biomarker data with small sample sizes. Our approach constructs few-shot tabular prompts using in-context learning examples from structured biomedical data and fine-tunes TableGPT2 using parameter-efficient qLoRA adaptation for the clinical binary classification task of distinguishing AD from cognitively normal (CN) subjects. The TAP-GPT framework harnesses the powerful tabular understanding ability of TableGPT2 and the encoded prior knowledge of LLMs to outperform more advanced general-purpose LLMs and a tabular foundation model (TFM) developed for prediction tasks. To our knowledge, this is the first application of LLMs to this prediction task using tabular biomarker data, paving the way for future LLM-driven multi-agent frameworks in biomedical informatics.
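To make the idea of a few-shot tabular prompt concrete, the following is a minimal sketch of how labeled rows and one unlabeled query row might be serialized into a single table for in-context learning. It is not the authors' prompt format: the function names, the column names (`age`, `mmse_score`, `apoe4_alleles`, `diagnosis`), the instruction wording, and the numeric values are all hypothetical placeholders chosen for illustration, and the values are not real patient data.

```python
from typing import Dict, List, Sequence

Row = Dict[str, object]


def rows_to_markdown(rows: Sequence[Row], columns: Sequence[str]) -> str:
    """Serialize rows as a Markdown-style table (one assumed serialization)."""
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = [
        "| " + " | ".join(str(row[col]) for col in columns) + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])


def build_few_shot_prompt(
    examples: Sequence[Row],
    query: Row,
    feature_columns: Sequence[str],
    label_column: str = "diagnosis",  # hypothetical label column name
) -> str:
    """Place labeled examples and a masked query row in one table."""
    columns: List[str] = [*feature_columns, label_column]
    # The query row carries "?" in the label column; the model is asked to fill it.
    masked_query = {**query, label_column: "?"}
    table = rows_to_markdown([*examples, masked_query], columns)
    return (
        "Each row describes one subject. The last row has an unknown diagnosis.\n\n"
        f"{table}\n\n"
        "Answer with AD or CN for the last row."
    )


# Placeholder values for illustration only; not real patient data.
FEATURES = ["age", "mmse_score", "apoe4_alleles"]
EXAMPLES = [
    {"age": 74, "mmse_score": 21, "apoe4_alleles": 2, "diagnosis": "AD"},
    {"age": 69, "mmse_score": 29, "apoe4_alleles": 0, "diagnosis": "CN"},
]
QUERY = {"age": 77, "mmse_score": 23, "apoe4_alleles": 1}

prompt = build_few_shot_prompt(EXAMPLES, QUERY, FEATURES)
print(prompt)
```

In a setup like the one the abstract describes, a string of this kind would be passed to the tabular LLM, and the same template would be reused when building the fine-tuning examples so that training and inference prompts match.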