🤖 AI Summary
To address key challenges in few-shot learning on tabular data, namely the difficulty of modeling mixed numeric and categorical fields, weak structural priors, and poor generalization, this paper proposes Orion-BiX, a general-purpose tabular foundation model that integrates biaxial attention with meta-learning. Methodologically, it couples biaxial attention with a multi-CLS grouped-aggregation mechanism to alternately capture intra-field and inter-field dependencies; introduces a label-aware in-context learning (ICL) head with hierarchical decision routing for context-adaptive inference; and employs causal-prior-guided synthetic data generation within an episodic meta-training paradigm. Orion-BiX exposes a scikit-learn-compatible API. On public benchmarks it outperforms gradient-boosting baselines and matches state-of-the-art tabular foundation models in accuracy, demonstrating the effectiveness and robustness of biaxial attention and episodic meta-training for few-shot tabular modeling.
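To make the biaxial-attention idea concrete, here is a minimal sketch of alternating attention along the two axes of a table: within each row (across fields) and within each column (across samples). The single-head, projection-free form and the tensor shapes are simplifying assumptions for illustration; the paper's actual encoder also interleaves grouped, hierarchical, and relational attention variants and multi-CLS summarization, which this sketch omits.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attend(x):
    # x: (tokens, dim); plain scaled dot-product self-attention
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores) @ x

def biaxial_block(table):
    # table: (rows, cols, dim) embedded tabular data
    # 1) intra-row attention: each sample's fields attend to each other
    table = np.stack([self_attend(row) for row in table])
    # 2) intra-column attention: each field attends across samples
    table = np.stack(
        [self_attend(col) for col in table.transpose(1, 0, 2)]
    ).transpose(1, 0, 2)
    return table

rng = np.random.default_rng(0)
out = biaxial_block(rng.normal(size=(4, 3, 8)))
print(out.shape)  # (4, 3, 8): shape is preserved, as in a residual block
```

Alternating the two axes is what lets the model mix information both between features of one sample and between samples sharing a feature, without the quadratic cost of attending over all row-field pairs at once.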
📝 Abstract
Tabular data drive most real-world machine learning applications, yet building general-purpose models for them remains difficult. Mixed numeric and categorical fields, weak feature structure, and limited labeled data make scaling and generalization challenging. To this end, we introduce Orion-BiX, a tabular foundation model that combines biaxial attention with meta-learned in-context reasoning for few-shot tabular learning. Its encoder alternates standard, grouped, hierarchical, and relational attention, fusing their outputs through multi-CLS summarization to capture both local and global dependencies efficiently. A label-aware ICL head adapts on the fly and scales to large label spaces via hierarchical decision routing. Meta-trained on synthetically generated, structurally diverse tables with causal priors, Orion-BiX learns transferable inductive biases across heterogeneous data. Delivered as a scikit-learn compatible foundation model, it outperforms gradient-boosting baselines and remains competitive with state-of-the-art tabular foundation models on public benchmarks, showing that biaxial attention with episodic meta-training enables robust, few-shot-ready tabular learning. The model is publicly available at https://github.com/Lexsi-Labs/Orion-BiX.
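The abstract states that Orion-BiX is delivered with a scikit-learn compatible interface. The sketch below illustrates what that contract means in practice (a `fit`/`predict` estimator usable like any scikit-learn classifier); the class name is a hypothetical stand-in, and a trivial nearest-centroid rule replaces the actual biaxial-attention network, since the package's real class names are not given here.

```python
import numpy as np

# Stand-in estimator following the scikit-learn contract that a
# "scikit-learn compatible" model such as Orion-BiX would expose.
# The nearest-centroid rule below is purely illustrative.
class FewShotTabularClassifier:
    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # one centroid per class, computed from the few labeled support rows
        self.centroids_ = np.stack(
            [X[y == c].mean(axis=0) for c in self.classes_]
        )
        return self  # scikit-learn convention: fit returns self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # squared distance of every query row to every class centroid
        d = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(-1)
        return self.classes_[d.argmin(axis=1)]

# usage mirrors any scikit-learn estimator: fit on support data, then predict
clf = FewShotTabularClassifier().fit(
    [[0, 0], [0, 1], [5, 5], [6, 5]], [0, 0, 1, 1]
)
print(clf.predict([[0.2, 0.1], [5.5, 5.0]]))  # → [0 1]
```

Because the interface matches scikit-learn's estimator protocol, such a model can drop into existing pipelines, cross-validation utilities, and model-selection tooling without adapter code.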