🤖 AI Summary
This work addresses the challenge that conventional AI models, trained on tables with fixed columns, cannot adapt during inference when new columns are dynamically added and cannot exploit such additions in an unsupervised manner. To this end, the paper introduces the novel task of Tabular Incremental Inference (TabII) and frames it as an optimization problem grounded in information bottleneck theory, enabling dynamic integration of newly introduced columns without retraining. The proposed approach combines Large Language Model placeholders with a pretrained TabAdapter to incorporate external knowledge, and uses Incremental Sample Condensation blocks to condense the task-relevant information carried by the incremental column attributes. Evaluated on eight public datasets, the method achieves state-of-the-art performance, significantly enhancing model robustness and practical utility under evolving column structures.
📝 Abstract
Tabular data is a fundamental data structure. The evolution of table analysis tools reflects humanity's continuous progress in data acquisition, management, and processing. Dynamic changes in table columns arise from technological advancements, changing needs, data integration, etc. However, the standard process of training AI models on tables with fixed columns and then performing inference is not suitable for handling dynamically changing tables. Therefore, new methods are needed for efficiently handling such tables in an unsupervised manner. In this paper, we introduce a new task, Tabular Incremental Inference (TabII), which aims to enable trained models to incorporate new columns during the inference stage, enhancing the practicality of AI models in scenarios where tables change dynamically. Furthermore, we demonstrate that this new task can be framed as an optimization problem based on information bottleneck theory, which shows that the key to an ideal tabular incremental inference approach lies in minimizing the mutual information between the tabular data and its representation while maximizing that between the representation and the task labels. Under this guidance, we design a TabII method with Large Language Model placeholders and a pretrained TabAdapter to provide external knowledge, and Incremental Sample Condensation blocks to condense the task-relevant information provided by incremental column attributes. Experimental results across eight public datasets show that TabII effectively utilizes incremental attributes, achieving state-of-the-art performance.
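For reference, the standard information-bottleneck objective the abstract alludes to can be written as follows (this is the classical formulation with trade-off parameter β; the notation is illustrative and not taken from the paper itself):

```latex
% Information bottleneck: learn an encoding p(z|x) of input X into a
% representation Z that discards input detail, I(X;Z), while retaining
% label-relevant information, I(Z;Y); \beta trades off the two terms.
\min_{p(z \mid x)} \; \mathcal{L}_{\mathrm{IB}}
  \;=\; I(X; Z) \;-\; \beta \, I(Z; Y)
```

In the TabII setting described above, X would comprise both the original and the incrementally added columns, and Z the condensed representation used for prediction.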