AI Summary
This work addresses the inflexibility of existing tabular foundation models in adapting to downstream tasks during inference, as conventional fine-tuning or parameter-efficient methods incur substantial computational overhead and rely heavily on internal model architecture. To overcome these limitations, we propose a lightweight, architecture-agnostic input-space residual adapter that operates under a frozen backbone. The adapter learns task-specific input perturbations through end-to-end training and incorporates an identity fallback mechanism, allowing the validation set to automatically determine whether adaptation should be activated, thus balancing performance and robustness. Without modifying any model weights, our approach achieves significant gains on TabArena-Lite, with TabICLv2-Retouche surpassing the baseline by +56 Elo points and attaining a Pareto-optimal trade-off between predictive quality and training/inference efficiency.
Abstract
Tabular foundation models (TFMs), such as TabPFN-2.6, TabICLv2, ConTextTab, Mitra, LimiX, and TabDPT, achieve strong zero-shot performance through in-context learning, but their inductive biases remain fixed at inference time. Adapting a pretrained TFM to a specific dataset or task typically requires either full fine-tuning, which is computationally expensive, or parameter-efficient tuning methods (PEFT) such as LoRA, which must be tailored to the internal architecture of each TFM. Furthermore, the evidence on whether weight-space fine-tuning improves accuracy or calibration is mixed \citep{tanna_exploring_2026,rubachev_finetuning_2025}. We introduce TFM-Retouche, a lightweight input-space residual adapter that is architecture-agnostic by design with respect to the frozen TFM backbone. TFM-Retouche learns a small residual correction in the input space to align the input data with the inductive biases of the pretrained model. The adapter is trained end-to-end through the frozen TFM, with a post-training identity guard that falls back to the unmodified TFM whenever adaptation does not help on held-out validation. On TabArena-Lite (51 datasets spanning binary classification, multiclass classification, and regression), TabICLv2-Retouche -- the framework instantiated on TabICLv2 -- is the top-ranked method on the leaderboard with light per-task tuning and ensembling, lifting aggregate Elo by +56 over the frozen TabICLv2 base and sitting on the Pareto front of predictive quality versus both training and inference time.
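The mechanism described above (a residual correction in input space, trained through a frozen backbone, followed by an identity guard on held-out validation) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `FROZEN_W` and `frozen_model` are a hypothetical fixed linear scorer standing in for a pretrained TFM, the per-feature `scale`/`shift` residual is just one possible adapter form, and plain gradient descent on squared error replaces whatever training setup TFM-Retouche actually uses, which the abstract does not specify.

```python
import random

# Hypothetical frozen backbone: a fixed linear scorer standing in for a
# pretrained TFM. Its weights are never updated below.
FROZEN_W = (1.0, -2.0)

def frozen_model(x):
    return sum(w * xi for w, xi in zip(FROZEN_W, x))

def adapt(x, scale, shift):
    # Input-space residual: x + (scale * x + shift). Zero parameters = identity.
    return [xi + s * xi + b for xi, s, b in zip(x, scale, shift)]

def mse(data, scale, shift):
    return sum((frozen_model(adapt(x, scale, shift)) - y) ** 2
               for x, y in data) / len(data)

def train_adapter(train, steps=500, lr=0.05):
    d = len(FROZEN_W)
    scale, shift = [0.0] * d, [0.0] * d  # start at the identity map
    for _ in range(steps):
        g_scale, g_shift = [0.0] * d, [0.0] * d
        for x, y in train:
            err = frozen_model(adapt(x, scale, shift)) - y
            for i in range(d):
                # The gradient reaches the adapter through the frozen weights,
                # which are read here but never modified.
                g_in = 2.0 * err * FROZEN_W[i] / len(train)
                g_scale[i] += g_in * x[i]
                g_shift[i] += g_in
        scale = [s - lr * g for s, g in zip(scale, g_scale)]
        shift = [b - lr * g for b, g in zip(shift, g_shift)]
    return scale, shift

def identity_guard(val, scale, shift):
    # Keep the adapter only if it strictly improves held-out validation error;
    # otherwise fall back to the unmodified frozen model.
    zeros = [0.0] * len(scale)
    if mse(val, scale, shift) < mse(val, zeros, zeros):
        return scale, shift, True
    return zeros, zeros, False

def make_data(n, rng, misaligned):
    # Toy data: when `misaligned`, targets come from shifted/rescaled inputs,
    # so the raw inputs do not match what the frozen model expects.
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        src = [x[0] + 0.5, 1.5 * x[1]] if misaligned else x
        data.append((x, frozen_model(src)))
    return data

rng = random.Random(0)
train, val = make_data(40, rng, True), make_data(20, rng, True)
scale, shift, used = identity_guard(val, *train_adapter(train))
print("adapter kept:", used)
```

Initializing the residual at zero means training starts from the unmodified model, and the guard compares against that same baseline, so in this sketch the adapted pipeline cannot score worse than the frozen model on the validation split.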