🤖 AI Summary
This work addresses two limitations of tabular foundation models (TFMs): context length limits that restrict them to medium-scale data, and degraded inference performance when inference-time data exceed the size distributions seen in pretraining. We propose VIP-COP, a fast, budget-aware, anytime, black-box-compatible, and interpretable method for hard context optimization. Without requiring access to model internals, the approach explicitly identifies high-value samples and critical predictive features by estimating their importance scores with an online KernelSHAP-based regression, combined with iterative refinement, value-guided context sampling, and multi-fidelity pruning. Evaluated on large-scale, high-dimensional tabular testbeds, including data-noise and data-augmentation settings, the method consistently outperforms heuristic and optimized baselines, establishing a new state of the art in test-time context optimization.
📝 Abstract
Tabular foundation models (TFMs) have emerged as a powerful paradigm for in-context learning on structured data, enabling direct prediction on new tabular tasks without task-specific training. However, their effectiveness is constrained by context length limits, which restrict their application to medium-scale data and degrade performance when inference-time data exceed the size distributions seen in pretraining. Our work introduces VIP-COP, which estimates the Value of Importance for Prediction of training examples and features to perform hard Context OPtimization for TFMs. Its explicit selection mechanism suppresses noise and isolates influential data, which also lets the model benefit from data augmentation by prioritizing high-value augmented samples and features. VIP-COP is (i) fast, often boosting performance within minutes of optimization, building on an online KernelSHAP-based regression with iterative refinement, value-guided context sampling, and multi-fidelity pruning; (ii) budget-aware and anytime, improving with additional test-time compute, unlike heuristics that produce fixed contexts; (iii) model-aware yet fully black-box, requiring no access to model internals, which makes it compatible with both proprietary and open-source TFMs; (iv) interpretable, identifying discrete “Very Important Predictors” (samples and features) that maximize signal-to-noise, which makes it (v) robust, isolating high-value data from noise. In contrast, soft-prompt optimization requires model gradients, produces abstract latent tokens, and lacks explicit signal discrimination. Extensive experiments show that VIP-COP consistently outperforms heuristic and optimized baselines across large-scale, high-dimensional testbeds, including data-augmentation and data-noise settings, establishing a new state of the art in test-time context refinement for TFMs.
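To make the idea of black-box, KernelSHAP-style valuation concrete, here is a minimal sketch of how per-example values might be estimated from context masks and then used to pick a context under a budget. It is not the VIP-COP implementation: it omits iterative refinement, value-guided sampling, and multi-fidelity pruning, and it drops KernelSHAP's efficiency constraint for brevity. The names `sample_masks`, `estimate_values`, `select_context`, and `toy_utility` are hypothetical, and `toy_utility` is an assumed stand-in for the validation score a TFM would return for a given context.

```python
import numpy as np


def sample_masks(n_items, n_samples, rng):
    """Draw binary context masks whose sizes follow the Shapley kernel."""
    sizes = np.arange(1, n_items)
    # Total kernel mass over all subsets of size s is (n - 1) / (s * (n - s)).
    probs = (n_items - 1) / (sizes * (n_items - sizes))
    probs = probs / probs.sum()
    masks = np.zeros((n_samples, n_items))
    for row, size in enumerate(rng.choice(sizes, size=n_samples, p=probs)):
        masks[row, rng.choice(n_items, size=size, replace=False)] = 1.0
    return masks


def estimate_values(utility, n_items, n_samples=256, seed=0):
    """Regress black-box scores on masks; coefficients act as item values."""
    rng = np.random.default_rng(seed)
    masks = sample_masks(n_items, n_samples, rng)
    # Only the score is observed: no gradients or model internals are used.
    scores = np.array([utility(m.astype(bool)) for m in masks])
    design = np.hstack([np.ones((n_samples, 1)), masks])  # intercept + masks
    coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
    return coef[1:]  # drop the intercept


def select_context(values, budget):
    """Keep the `budget` items with the highest estimated value."""
    return np.argsort(values)[::-1][:budget]


# Assumption: an additive toy score standing in for a TFM's validation metric.
true_gain = np.array([0.30, -0.10, 0.05, 0.20, 0.00, -0.05])


def toy_utility(mask):
    return 0.5 + true_gain[mask].sum()


values = estimate_values(toy_utility, n_items=6)
chosen = select_context(values, budget=2)
print(np.round(values, 3), sorted(chosen.tolist()))
```

Because the toy score is exactly additive, the regression recovers each item's contribution and the two most helpful examples are selected. With a real TFM the scores would be noisy and non-additive, so the estimates would only approximate each item's value and would typically improve as more masks are evaluated.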