🤖 AI Summary
To address the inefficiency and poor scalability of in-context learning (ICL) with large language models (LLMs) on large-scale tabular classification, particularly million-sample datasets with thousand-dimensional features, this work introduces linear attention into tabular ICL for the first time, sidestepping the quadratic-complexity bottleneck of standard self-attention. The method combines linear attention, a lightweight Transformer architecture, dimensionality reduction, and adaptive data sampling into a training-free, end-to-end scalable inference framework. On the poker-hand dataset (>1M samples), it achieves a single-inference latency of just 5 seconds, over 2× faster than TabPFN and 1.5× faster than XGBoost, and outperforms 25 tested baselines in efficiency across a diverse range of datasets. It further demonstrates millisecond-level response times and robust generalization across data scales.
📝 Abstract
Leveraging the in-context learning (ICL) capability of Large Language Models (LLMs) for tabular classification has gained significant attention for its training-free adaptability across diverse datasets. Recent advances such as TabPFN excel on small-scale tabular datasets but struggle to scale to large and complex ones. Our work enhances the efficiency and scalability of TabPFN on larger datasets by incorporating linear attention mechanisms as a scalable alternative to quadratic-complexity self-attention. Our model, TabFlex, efficiently handles tabular datasets with thousands of features and hundreds of classes, scaling seamlessly to millions of samples. For instance, TabFlex processes the poker-hand dataset, with over a million samples, in just 5 seconds. Our extensive evaluations demonstrate that TabFlex achieves over a 2x speedup compared to TabPFN and a 1.5x speedup over XGBoost, outperforming 25 tested baselines in terms of efficiency across a diverse range of datasets. Furthermore, TabFlex remains highly effective on large-scale datasets, delivering strong performance with significantly reduced computational costs, especially when combined with data-efficient techniques such as dimensionality reduction and data sampling.
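The abstract does not spell out why linear attention removes the quadratic bottleneck, so here is a minimal NumPy sketch of the general idea: standard attention materializes an n×n score matrix, whereas linear attention applies a feature map φ and reassociates the matrix products so cost grows linearly in the number of rows n. The feature map (ReLU plus a small constant) and all function names are illustrative assumptions, not TabFlex's actual implementation.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: builds an (n x n) score matrix -> O(n^2) time and memory.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Linear attention: by associativity, compute phi(K)^T V first.
    # That intermediate is only (d x d_v), so cost is O(n * d * d_v), linear in n.
    # phi (here ReLU + epsilon) is an assumed feature map keeping denominators positive.
    Kv = phi(K).T @ V                 # (d, d_v) summary of keys and values
    Z = phi(K).sum(axis=0)            # (d,) normalizer accumulated over rows
    num = phi(Q) @ Kv                 # (n, d_v)
    den = phi(Q) @ Z                  # (n,)
    return num / den[:, None]

# Toy check on synthetic "tabular rows": both variants map (n, d) inputs
# to (n, d) outputs, but only the linear one avoids the n x n matrix.
n, d = 1000, 16
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, n, d))
out = linear_attention(Q, K, V)
assert out.shape == (n, d)
assert np.all(np.isfinite(out))
```

The key design point is that the (d × d_v) summary `Kv` can be reused for every query row, which is what lets an ICL-style model ingest millions of in-context samples without the score matrix exploding.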