KMLP: A Scalable Hybrid Architecture for Web-Scale Tabular Data Modeling

📅 2026-02-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the scalability challenges in modeling ultra-large-scale tabular data—characterized by billions of samples and hundreds of heterogeneous numerical features—arising from feature anisotropy, heavy-tailed distributions, and non-stationarity. To this end, we propose KMLP, a hybrid architecture that combines a shallow Kolmogorov–Arnold Network (KAN) at the front end to automatically learn per-feature nonlinear transformations, followed by a gated MLP (gMLP) to capture high-order feature interactions. This is the first approach to integrate KAN with gMLP, establishing an end-to-end scalable modeling paradigm that eliminates the need for manual feature engineering. Evaluated on both public benchmarks and industrial-scale datasets with billions of records, KMLP achieves state-of-the-art performance, with its advantage over mainstream baselines such as GBDT becoming increasingly pronounced as data scale grows.

📝 Abstract
Predictive modeling on web-scale tabular data with billions of instances and hundreds of heterogeneous numerical features faces significant scalability challenges. These features exhibit anisotropy, heavy-tailed distributions, and non-stationarity, creating bottlenecks for models like Gradient Boosting Decision Trees and requiring laborious manual feature engineering. We introduce KMLP, a hybrid deep architecture integrating a shallow Kolmogorov-Arnold Network (KAN) front-end with a Gated Multilayer Perceptron (gMLP) backbone. The KAN front-end uses learnable activation functions to automatically model complex non-linear transformations for each feature, while the gMLP backbone captures high-order interactions. Experiments on public benchmarks and an industrial dataset with billions of samples show KMLP achieves state-of-the-art performance, with advantages over baselines like GBDTs increasing at larger scales, validating KMLP as a scalable deep learning paradigm for large-scale web tabular data.
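The abstract describes the two-stage design in prose only. As an illustration, the sketch below shows a forward pass in the described style: a KAN-like front-end applies a learnable nonlinearity to each scalar feature, and a gMLP block with spatial gating mixes information across features. All names, dimensions, and basis choices here are assumptions for illustration (Gaussian RBFs stand in for the B-splines typically used in KANs); this is not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def silu(x):
    return x / (1.0 + np.exp(-x))

class KANFeatureLayer:
    """Per-feature learnable activation: phi(x) = w_b * silu(x) + sum_k c_k * RBF_k(x).
    (Gaussian RBF basis used here as a stand-in for KAN's B-spline basis.)"""
    def __init__(self, n_features, n_basis=8, x_range=(-2.0, 2.0)):
        self.centers = np.linspace(*x_range, n_basis)            # (K,) fixed grid
        self.width = (x_range[1] - x_range[0]) / n_basis
        self.coef = rng.normal(0, 0.1, (n_features, n_basis))    # learnable c_k per feature
        self.w_base = np.ones(n_features)                        # learnable residual weight

    def __call__(self, x):                                       # x: (batch, n_features)
        basis = np.exp(-((x[..., None] - self.centers) / self.width) ** 2)  # (B, F, K)
        return self.w_base * silu(x) + np.einsum("bfk,fk->bf", basis, self.coef)

class GMLPBlock:
    """Gated MLP over the feature axis: project up, split channels, gate one
    half with a learned linear mix across features, multiply, project down."""
    def __init__(self, n_features, d_model, d_ffn):
        self.W_in = rng.normal(0, 0.02, (d_model, d_ffn))
        self.W_out = rng.normal(0, 0.02, (d_ffn // 2, d_model))
        self.W_gate = rng.normal(0, 0.02, (n_features, n_features))  # spatial gating
        self.b_gate = np.ones(n_features)

    def __call__(self, h):                                       # h: (B, F, d_model)
        u = np.maximum(h @ self.W_in, 0.0)                       # ReLU stand-in for GELU
        u1, u2 = np.split(u, 2, axis=-1)
        gate = np.einsum("fg,bgd->bfd", self.W_gate, u2) + self.b_gate[None, :, None]
        return (u1 * gate) @ self.W_out

# Hypothetical end-to-end forward pass: KAN-transform each scalar feature,
# embed, apply one residual gMLP block, then a mean-pooled linear head.
B, F, D = 4, 16, 32
x = rng.normal(size=(B, F))
kan = KANFeatureLayer(F)
embed = rng.normal(0, 0.02, (1, D))        # shared scalar-to-vector embedding
gmlp = GMLPBlock(F, D, 64)
head = rng.normal(0, 0.02, (D, 1))

h = kan(x)[..., None] * embed              # (B, F, D)
h = h + gmlp(h)                            # residual gMLP block
logits = h.mean(axis=1) @ head             # (B, 1)
```

The per-feature KAN stage replaces hand-crafted transforms (binning, log-scaling, clipping) with learned ones, which is the "no manual feature engineering" claim in the summary; the gMLP's gating matrix then learns cross-feature interactions without attention.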
Problem

Research questions and friction points this paper is trying to address.

scalability
tabular data
web-scale
heterogeneous features
non-stationarity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Kolmogorov-Arnold Network
gMLP
scalable tabular modeling
learnable activation functions
web-scale data
Authors
Mingming Zhang (Beihang University)
Pengfei Shi (Ant Group)
Zhiqing Xiao (Zhejiang University)
Feng Zhao (Ant Group)
Guandong Sun (Ant Group)
Yulin Kang (Ant Group)
Ruizhe Gao (Ant Group)
Ningtao Wang (Ant Group)
Xing Fu (Ant Group)
Weiqiang Wang (Ant Group)
Junbo Zhao (Zhejiang University, ZJU100 Young Professor)