Latte: Transferring LLMs' Latent-level Knowledge for Few-shot Tabular Learning

📅 2025-05-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address poor generalization due to label scarcity, unreliable feature engineering, and high inference latency from test-time LLM invocation in few-shot tabular learning, this paper proposes an *implicit knowledge distillation framework at training time*, enabling directed transfer of latent priors from large language models (LLMs) to tabular models. The method comprises four key components: (i) implicit-space knowledge distillation, (ii) feature-value weighted fusion, (iii) LLM-tabular joint representation alignment, and (iv) semi-supervised optimization—supporting both unsupervised pretraining and unlabeled data augmentation while eliminating test-time LLM dependency entirely. Evaluated on multiple few-shot tabular benchmarks, the approach achieves state-of-the-art performance, particularly under extreme settings (≤5 samples per class), where it demonstrates significantly improved robustness over text-prompting and test-time knowledge-extraction baselines.
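The core of component (i) is aligning the tabular model's representations with the LLM's latent embeddings during training, so no LLM call is needed at inference. A minimal sketch of such an alignment loss is below; the projection matrix and cosine objective are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def latent_distillation_loss(llm_emb, tab_emb, proj):
    """Hypothetical latent-space alignment loss.

    llm_emb: (batch, d_llm) frozen LLM latent embeddings of the rows
    tab_emb: (batch, d_tab) tabular encoder outputs
    proj:    (d_tab, d_llm) learnable projection into the LLM space
    Returns mean cosine distance between projected tabular and LLM embeddings.
    """
    # Project tabular representations into the LLM latent space.
    z = tab_emb @ proj
    # Normalize both sides and compute 1 - cosine similarity per row.
    z_n = z / np.linalg.norm(z, axis=1, keepdims=True)
    t_n = llm_emb / np.linalg.norm(llm_emb, axis=1, keepdims=True)
    return float(np.mean(1.0 - np.sum(z_n * t_n, axis=1)))
```

Minimizing this term pulls the downstream model's representations toward the LLM's priors at training time only; at test time the tabular encoder runs alone, which is what removes the inference-latency cost.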

📝 Abstract
Few-shot tabular learning, in which machine learning models are trained with a limited amount of labeled data, provides a cost-effective approach to addressing real-world challenges. The advent of Large Language Models (LLMs) has sparked interest in leveraging their pre-trained knowledge for few-shot tabular learning. Despite promising results, existing approaches either rely on test-time knowledge extraction, which introduces undesirable latency, or text-level knowledge, which leads to unreliable feature engineering. To overcome these limitations, we propose Latte, a training-time knowledge extraction framework that transfers the latent prior knowledge within LLMs to optimize a more generalized downstream model. Latte enables general knowledge-guided downstream tabular learning, facilitating the weighted fusion of information across different feature values while reducing the risk of overfitting to limited labeled data. Furthermore, Latte is compatible with existing unsupervised pre-training paradigms and effectively utilizes available unlabeled samples to overcome the performance limitations imposed by an extremely small labeled dataset. Extensive experiments on various few-shot tabular learning benchmarks demonstrate the superior performance of Latte, establishing it as a state-of-the-art approach in this domain.
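The abstract's "weighted fusion of information across different feature values" can be pictured as attention-style pooling over per-feature embeddings. The sketch below is a hypothetical illustration, assuming a softmax over dot-product scores against a query vector (which in Latte would be informed by LLM-derived priors); it is not the paper's published implementation.

```python
import numpy as np

def weighted_feature_fusion(feat_embs, query):
    """Hypothetical attention-style fusion of feature-value embeddings.

    feat_embs: (num_features, dim) one embedding per feature value in a row
    query:     (dim,) fusion query (e.g., derived from LLM prior knowledge)
    Returns (weights, fused) where fused is the (dim,) row representation.
    """
    # Scaled dot-product scores, one per feature.
    scores = feat_embs @ query / np.sqrt(feat_embs.shape[1])
    # Numerically stable softmax turns scores into fusion weights.
    w = np.exp(scores - scores.max())
    w /= w.sum()
    # Weighted sum of feature embeddings gives the fused row vector.
    return w, w @ feat_embs
```

Weighting features this way, rather than concatenating them uniformly, is one plausible mechanism for down-weighting uninformative columns when only a handful of labeled rows are available.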
Problem

Research questions and friction points this paper is trying to address.

Transferring LLMs' latent knowledge for few-shot tabular learning
Overcoming unreliable feature engineering in tabular data
Enhancing performance with limited labeled and unlabeled data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Latte transfers LLMs' latent knowledge for tabular learning
Training-time extraction avoids test-time latency issues
Combines unsupervised pre-training with few-shot learning
Ruxue Shi
Jilin University Grad Student
Tabular Learning · Data Mining

Hengrui Gu
North Carolina State University
Knowledge Maintenance

Hangting Ye
Jilin University
Machine Learning · Data Mining

Yiwei Dai
Jilin University, Changchun, China

Xu Shen
Jilin University, Changchun, China

Xin Wang
Jilin University, Changchun, China