🤖 AI Summary
Deep neural networks often underperform tree-based models on tabular data due to irrelevant features, feature heterogeneity, and local irregularities. To address this, this work proposes the LassoFlexNet architecture, built around five inductive biases: robustness to irrelevant features, axis alignment, localized irregularities, feature heterogeneity, and training stability. It employs per-feature embeddings to separately capture each input's linear and nonlinear marginal contributions, integrates a Tied Group Lasso penalty for sparse variable selection, and introduces a novel Sequential Hierarchical Proximal Adaptive Gradient optimizer with exponential moving averages (EMA) to keep training stable. This design breaks the undesirable rotational invariance inherent in standard neural networks. Evaluated across 52 benchmark datasets, LassoFlexNet matches or surpasses leading tree-based models, achieving up to a 10% relative performance gain while retaining Lasso-like interpretability.
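The summary's "per-feature embeddings" idea — scoring each input's linear and nonlinear marginal contribution separately and summing over features — can be illustrated with a minimal NumPy sketch. This is a generic additive-per-feature forward pass under our own assumptions (ReLU, one tiny 1→h→1 network per feature), not the paper's actual LassoFlexNet implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def per_feature_forward(x, W_lin, W_emb, b_emb, w_out):
    """Sum of per-feature linear and nonlinear contributions.

    x: (n, d). Each feature j has its own linear weight W_lin[j] and its
    own tiny 1 -> h -> 1 ReLU network (W_emb[j], b_emb[j], w_out[j]), so
    contributions are axis-aligned rather than rotation-invariant.
    """
    linear = x * W_lin                                       # (n, d)
    hidden = np.maximum(0.0, x[:, :, None] * W_emb + b_emb)  # (n, d, h)
    nonlinear = np.einsum('ndh,dh->nd', hidden, w_out)       # (n, d)
    return (linear + nonlinear).sum(axis=1)                  # (n,)

d, h = 4, 8
x = rng.normal(size=(16, d))
W_lin = rng.normal(size=d)
W_emb = rng.normal(size=(d, h))
b_emb = rng.normal(size=(d, h))
w_out = rng.normal(size=(d, h))
y = per_feature_forward(x, W_lin, W_emb, b_emb, w_out)
print(y.shape)  # one scalar prediction per row
```

Because the model is additive over features, zeroing all of feature j's parameters removes exactly that feature's contribution — which is what makes a group-wise sparsity penalty a natural fit for variable selection here.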
📝 Abstract
Despite their dominance in vision and language, deep neural networks often underperform relative to tree-based models on tabular data. To bridge this gap, we incorporate five key inductive biases into deep learning: robustness to irrelevant features, axis alignment, localized irregularities, feature heterogeneity, and training stability. We propose \emph{LassoFlexNet}, an architecture that evaluates the linear and nonlinear marginal contributions of each input via Per-Feature Embeddings, and sparsely selects relevant variables using a Tied Group Lasso mechanism. Because these components introduce optimization challenges that destabilize standard proximal methods, we develop a \emph{Sequential Hierarchical Proximal Adaptive Gradient optimizer with exponential moving averages (EMA)} to ensure stable convergence. Across $52$ datasets from three benchmarks, LassoFlexNet matches or outperforms leading tree-based models, achieving up to a $10$\% relative gain, while maintaining Lasso-like interpretability. We substantiate these empirical results with ablation studies and theoretical proofs confirming the architecture's enhanced expressivity and structural breaking of undesired rotational invariance.
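The sparse variable selection the abstract describes is built on group-Lasso-style penalties, whose standard proximal update is block soft-thresholding: each feature's parameter group is shrunk toward zero and dropped entirely once its norm falls below the threshold. The sketch below shows only that textbook step — it is not the paper's Tied Group Lasso mechanism or its Sequential Hierarchical optimizer, just the building block such proximal methods rest on:

```python
import numpy as np

def group_lasso_prox(W, lam):
    """Proximal operator of lam * sum_j ||W[j]||_2 (one group per row).

    Each row is rescaled by max(0, 1 - lam / ||row||): rows with norm
    below lam are zeroed exactly, deselecting that feature wholesale.
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return W * scale

W = np.array([[3.0, 4.0],    # norm 5.0 -> shrunk by factor 0.8, kept
              [0.3, 0.4]])   # norm 0.5 < lam -> zeroed out
print(group_lasso_prox(W, 1.0))  # [[2.4, 3.2], [0.0, 0.0]]
```

The exact zeros are what give the method its Lasso-like interpretability: surviving groups correspond directly to the selected features.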