Is One Layer Enough? Understanding Inference Dynamics in Tabular Foundation Models

📅 2026-05-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study examines how Transformer-based tabular foundation models carry out inference and how much of their depth is redundant. Through a large-scale, layer-wise analysis of six state-of-the-art tabular in-context learning models, it traces how predictions emerge across depth, identifies distinct stages of inference, and describes latent-space dynamics that differ from those of language models. Building on these findings, the authors design a proof-of-concept looped single-layer model that achieves comparable performance with only 20% of the original model's parameters, which supports the depth-redundancy hypothesis.
📝 Abstract
Transformer-based tabular foundation models (TFMs) dominate small to medium tabular predictive benchmark tasks, yet their inference mechanisms remain largely unexplored. We present the first large-scale mechanistic study of layerwise dynamics in 6 state-of-the-art tabular in-context learning models. We explore how predictions emerge across depth, identify distinct stages of inference and reveal latent-space dynamics that differ from those of language models. Our findings indicate substantial depthwise redundancy across multiple models, suggesting iterative refinement with overlapping computations during inference stages. Guided by these insights, we design a proof-of-concept, looped single-layer model that uses only 20% of the original model's parameters while achieving comparable performance. The code is available at https://github.com/amirbalef/is_one_layer_enough.
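The idea of a looped single-layer model can be illustrated with a small, hedged sketch. This is not the authors' architecture: the toy residual layer, the width `DIM`, the depth `DEPTH`, and all helper names are hypothetical choices made only to show how reusing one layer's weights across iterations shrinks the parameter count relative to a stack of distinct layers. In this toy the ratio is simply 1/`DEPTH`, and it is not meant to reproduce the 20% figure reported in the abstract.

```python
import math
import random

def make_layer(dim, rng):
    """Create a toy layer as a dim x dim weight matrix (hypothetical stand-in
    for a Transformer block; the real models are far larger)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(dim)]

def apply_layer(weights, x):
    """Residual update: x_i <- x_i + tanh(sum_j W_ij * x_j)."""
    return [xi + math.tanh(sum(w * v for w, v in zip(row, x)))
            for row, xi in zip(weights, x)]

def run(layers, x):
    """Apply the layers in order; a looped model passes the same layer repeatedly."""
    for weights in layers:
        x = apply_layer(weights, x)
    return x

def count_params(layers):
    """Count parameters, counting a reused (weight-tied) layer only once."""
    unique = {id(w): w for w in layers}
    return sum(len(w) * len(w[0]) for w in unique.values())

rng = random.Random(0)
DIM, DEPTH = 8, 4  # assumed toy sizes, not taken from the paper

# Standard stack: DEPTH layers, each with its own weights.
stacked = [make_layer(DIM, rng) for _ in range(DEPTH)]

# Looped model: one layer whose weights are reused DEPTH times.
shared = make_layer(DIM, rng)
looped = [shared] * DEPTH

stacked_params = count_params(stacked)
looped_params = count_params(looped)
ratio = looped_params / stacked_params

x0 = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
y_stacked = run(stacked, x0)
y_looped = run(looped, x0)
```

Both models perform the same number of layer applications at inference time. Only the number of stored weights differs, which is the trade-off a looped design makes: compute stays roughly constant while memory for parameters drops.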
Problem

Research questions and friction points this paper is trying to address.

tabular foundation models
inference dynamics
layerwise redundancy
in-context learning
transformer
Innovation

Methods, ideas, or system contributions that make the work stand out.

tabular foundation models
in-context learning
layerwise dynamics
depthwise redundancy
single-layer architecture