FEAT: A Linear-Complexity Foundation Model for Extremely Large Structured Data

📅 2026-03-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing large structured data models struggle to efficiently handle ultra-scale datasets due to the quadratic complexity of attention mechanisms, representational degradation, and mismatches between pretraining and real-world data distributions. To address these limitations, this work proposes FEAT, a novel architecture featuring dual-axis linear complexity that integrates adaptive-fusion bidirectional Mamba-2 (AFBM) with convolutional gated linear attention (Conv-GLA) for efficient cross-sample modeling. Furthermore, FEAT incorporates a hybrid structural causal model and a stable reconstruction objective to enhance generalization and robustness. Evaluated across 11 real-world datasets, FEAT achieves state-of-the-art zero-shot performance, outperforming all baselines, while accelerating inference by up to 40× and maintaining computational costs that scale linearly with data size.
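The Conv-GLA component builds on gated linear attention, in which a small decayed state matrix replaces the N×N attention map. The sketch below shows a generic GLA-style recurrence, not the paper's actual Conv-GLA implementation; the function name and gate parameterization are hypothetical.

```python
import numpy as np

def gated_linear_attention(Q, K, V, G):
    # Generic gated linear attention recurrence (illustrative sketch):
    # a d_k x d_v state is decayed by a per-channel gate and updated
    # with k_t v_t^T, so a full pass costs O(N * d_k * d_v) instead
    # of the O(N^2) cost of materializing pairwise attention scores.
    N, dk = Q.shape
    dv = V.shape[1]
    S = np.zeros((dk, dv))            # running state ("global memory")
    out = np.empty((N, dv))
    for t in range(N):
        S = G[t][:, None] * S + np.outer(K[t], V[t])  # gated state update
        out[t] = Q[t] @ S                             # read out with query
    return out

rng = np.random.default_rng(1)
N, dk, dv = 8, 4, 4
Q = rng.normal(size=(N, dk))
K = rng.normal(size=(N, dk))
V = rng.normal(size=(N, dv))
G = rng.uniform(0.8, 1.0, size=(N, dk))  # decay gates in (0, 1)
out = gated_linear_attention(Q, K, V, G)
print(out.shape)  # (8, 4)
```

Because the state has fixed size regardless of N, memory stays constant as the sample count grows, which is what makes cross-sample modeling over extremely large datasets feasible.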

📝 Abstract
Structured data is foundational to healthcare, finance, e-commerce, and scientific data management. Large structured-data models (LDMs) extend the foundation model paradigm to unify heterogeneous datasets for tasks such as classification, regression, and decision support. However, existing LDMs face major limitations. First, most rely on sample-wise self-attention, whose O(N^2) complexity limits the sample count. Second, linear sequence models often degrade representations due to hidden-state compression and artificial causal bias. Third, synthetic-only pre-training often fails to match real-world distributions. We propose FEAT, a linear-complexity foundation model for extremely large structured data. FEAT introduces a multi-layer dual-axis architecture that replaces quadratic attention with hybrid linear encoding. The architecture combines adaptive-fusion bi-Mamba-2 (AFBM) for local sample dependencies and convolutional gated linear attention (Conv-GLA) for global memory. This design enables linear-complexity cross-sample modeling while preserving expressive representations. To improve robustness, FEAT adopts a hybrid structural causal model pipeline and a stable reconstruction objective. Experiments on 11 real-world datasets show that FEAT consistently outperforms baselines in zero-shot performance, while scaling linearly and achieving up to 40× faster inference.
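The O(N^2) vs. linear distinction in the abstract comes down to the order of matrix products. A minimal sketch of the standard kernel trick follows, assuming a generic feature map φ; this illustrates the complexity argument only, not FEAT's actual encoder.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: the N x N score matrix makes cost O(N^2 * d),
    # which limits how many samples can attend to each other.
    scores = Q @ K.T / np.sqrt(Q.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Kernelized attention: computing K^T V (a d x d matrix) first
    # avoids the N x N score matrix, giving O(N * d^2) time overall.
    Qp, Kp = phi(Q), phi(K)
    kv = Kp.T @ V                 # (d, d) summary, independent of N
    z = Qp @ Kp.sum(axis=0)       # per-row normalizer
    return (Qp @ kv) / z[:, None]

rng = np.random.default_rng(0)
N, d = 1000, 16
Q, K, V = rng.normal(size=(3, N, d))
out = linear_attention(Q, K, V)
print(out.shape)  # (1000, 16)
```

Doubling N doubles the cost of the linear variant but quadruples the cost of the softmax variant, which is the scaling gap the paper's hybrid linear encoding is designed to close.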
Problem

Research questions and friction points this paper is trying to address.

structured data
foundation model
computational complexity
representation degradation
pre-training distribution mismatch
Innovation

Methods, ideas, or system contributions that make the work stand out.

linear-complexity
dual-axis architecture
bi-Mamba-2
Conv-GLA
structured foundation model