AI Summary
To address the challenges of modeling high-frequency, sharp signals in tabular data, the poor generalization of deep neural networks in low-label regimes, and the lack of effective augmentation strategies for self-supervised learning, this paper proposes a neural-tree hybrid autoencoder framework. It tightly couples a deep autoencoder with an oblivious soft decision tree, employing a dual-encoder architecture and sample-adaptive gating to generate model-driven, complementary input views without explicit data augmentation. Joint training is achieved via a cross-reconstruction loss and a shared decoder, while spectral analysis reveals complementary inductive biases between the two components. Evaluated on multiple low-label tabular classification and regression tasks, the method consistently outperforms state-of-the-art deep models and supervised tree-based baselines, demonstrating superior representation learning and generalization.
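The sample-adaptive gating mentioned above can be illustrated with a minimal NumPy sketch. Everything here is an illustrative assumption rather than the paper's implementation: the function name `stochastic_gates`, the linear gating network (`Wg`, `bg`), and the Gaussian-noise-plus-clip relaxation are stand-ins for whatever parameterization the authors actually use.

```python
import numpy as np

def stochastic_gates(x, Wg, bg, sigma=0.5, rng=None):
    # Sample-specific stochastic gating (sketch): a tiny linear "gating
    # network" maps each sample to per-feature gate means; Gaussian noise
    # plus a hard clip to [0, 1] yields relaxed Bernoulli-style gates.
    rng = rng or np.random.default_rng()
    mu = x @ Wg + bg                       # (n, d): per-sample gate means
    eps = rng.normal(scale=sigma, size=mu.shape)
    gates = np.clip(mu + eps + 0.5, 0.0, 1.0)
    return gates * x                       # gated "view" of the input
```

Because the gate means depend on `x`, each sample receives its own soft feature mask; giving each encoder its own gating network of this kind is one way to produce the model-specific input views the summary describes.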
Abstract
Deep neural networks often underperform on tabular data due to their sensitivity to irrelevant features and a spectral bias toward smooth, low-frequency functions. These limitations hinder their ability to capture the sharp, high-frequency signals that often define tabular structure, especially when labeled samples are scarce. While self-supervised learning (SSL) offers promise in such settings, it remains challenging in tabular domains due to the lack of effective data augmentations. We propose a hybrid autoencoder that combines a neural encoder with an oblivious soft decision tree (OSDT) encoder, each guided by its own stochastic gating network that performs sample-specific feature selection. Together, these structurally different encoders and model-specific gating networks implement model-based augmentation, producing complementary input views tailored to each architecture. The two encoders, trained with a shared decoder and a cross-reconstruction loss, learn distinct yet aligned representations that reflect their respective inductive biases. During training, the OSDT encoder (robust to noise and effective at modeling localized, high-frequency structure) guides the neural encoder toward representations better suited to tabular data. At inference, only the neural encoder is used, preserving flexibility and SSL compatibility. Spectral analysis highlights the distinct inductive biases of each encoder. Our method achieves consistent gains in low-label classification and regression across diverse tabular datasets, outperforming deep and tree-based supervised baselines.
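The dual-encoder/shared-decoder training signal can be sketched as follows. This is a hedged toy illustration, not the paper's method: the single-layer encoders, the sigmoid-split "oblivious tree" stand-in, and the assumed form of the cross-reconstruction loss (each encoder's latent reconstructed by the same decoder) are all simplifying assumptions.

```python
import numpy as np

def encoder_nn(x, W):
    # Neural-style encoder: one tanh layer (illustrative stand-in).
    return np.tanh(x @ W)

def encoder_tree(x, thresholds, leaf_values, temperature=1.0):
    # Oblivious soft-tree sketch: each feature is compared to a shared
    # threshold via a sigmoid (soft split); the resulting gates weight
    # a set of leaf embeddings to form the latent code.
    gates = 1.0 / (1.0 + np.exp(-(x - thresholds) / temperature))  # (n, d)
    return gates @ leaf_values                                     # (n, k)

def decoder(z, V):
    # Shared linear decoder maps either latent back to input space.
    return z @ V

def cross_reconstruction_loss(x, W, thresholds, leaf_values, V):
    # Both encoders feed the SAME decoder and must each reconstruct x,
    # which pushes their latent spaces into alignment (assumed loss form).
    rec_nn = decoder(encoder_nn(x, W), V)
    rec_tr = decoder(encoder_tree(x, thresholds, leaf_values), V)
    return np.mean((rec_nn - x) ** 2) + np.mean((rec_tr - x) ** 2)
```

Under this loss, gradients from the tree branch flow through the shared decoder, which is one plausible mechanism for the abstract's claim that the OSDT encoder guides the neural encoder during training while only the neural encoder is kept at inference.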