Mixture Experts with Test-Time Self-Supervised Aggregation for Tabular Imbalanced Regression

📅 2025-06-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the degradation of model generalization under unknown or shifted test distributions in tabular regression, caused by imbalanced continuous label distributions. To tackle this, the authors propose MATI (Mixture Experts with Test-Time Self-Supervised Aggregation for Tabular Imbalanced Regression). MATI performs unsupervised partitioning of the continuous target space: it identifies label regions via a Gaussian Mixture Model (GMM), trains a specialized expert network per region, and introduces a test-time self-supervised weighting mechanism that dynamically aggregates expert predictions based on input feature similarity. Evaluated on four real-world datasets covering house price prediction, bike-sharing demand forecasting, and age estimation, MATI achieves an average 7.1% improvement in MAE across three representative out-of-distribution test settings, demonstrating substantially better out-of-distribution generalization.
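The region-discovery step described in the summary can be sketched as follows. This is an illustrative reconstruction, not the authors' code: a GMM is fit on the continuous targets, each sample is assigned to the component (region) with the highest responsibility, and each region's samples would then train a dedicated expert. The two-cluster toy data and component count are assumptions for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy imbalanced targets: a dense low-value region and a rare high-value region
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(10, 2, 900), rng.normal(50, 5, 100)])

# Fit a GMM on the 1-D target values to discover label regions
gmm = GaussianMixture(n_components=2, random_state=0).fit(y.reshape(-1, 1))
regions = gmm.predict(y.reshape(-1, 1))  # region assignment per sample

# Each component's statistics (mean, variance, mixing weight) would then
# guide the training of a region-specific expert on its assigned samples.
for k in range(2):
    print(f"region {k}: mean={gmm.means_[k, 0]:.1f}, n={int((regions == k).sum())}")
```

Because the two target modes are well separated here, the GMM recovers the 900/100 split; in practice the number of components would be chosen by a criterion such as BIC.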


📝 Abstract
Tabular data serve as a fundamental and ubiquitous representation of structured information in numerous real-world applications, e.g., finance and urban planning. In tabular applications, data imbalance has been investigated mainly for classification tasks, where insufficient instances of certain labels impair the model's generalizability. However, imbalance in tabular regression remains underexplored, yet is critical: continuous labels lack clear class boundaries, and existing imbalanced regression work often relies on the simplifying assumption of a known, balanced test distribution. Such assumptions may not hold in practice and can lead to performance degradation. To address these issues, we propose MATI: Mixture Experts with Test-Time Self-Supervised Aggregation for Tabular Imbalanced Regression, featuring two key innovations: (i) the Region-Aware Mixture Expert, which adopts a Gaussian Mixture Model to capture underlying label regions; the statistical information of each Gaussian component is then used to synthesize and train region-specific experts that capture the unique characteristics of their respective regions. (ii) Test-Time Self-Supervised Expert Aggregation, which dynamically adjusts region-expert weights based on test-data features to adapt the experts to varying test distributions. We evaluated MATI on four real-world tabular imbalanced regression datasets, including house pricing, bike sharing, and age prediction. To reflect realistic deployment scenarios, we adopted three types of test distributions: a balanced distribution with uniform target frequencies, a normal distribution that follows the training data, and an inverse distribution that emphasizes rare target regions. On average across these three test distributions, MATI achieved a 7.1% improvement in MAE compared to existing methods.
Problem

Research questions and friction points this paper is trying to address.

Addresses data imbalance in tabular regression tasks
Handles the lack of clear boundaries between continuous label regions
Improves model adaptability to varying test distributions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Region-Aware Mixture Expert captures related regions
Self-Supervised Aggregation adjusts expert weights dynamically
Gaussian component statistics guide the synthesis and training of region-specific experts
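The test-time aggregation idea above can be sketched as a similarity-weighted combination of expert outputs. This is a hypothetical simplification, not the paper's exact mechanism: region feature centroids, the distance-based softmax weighting, and the `tau` temperature are all assumptions introduced for illustration.

```python
import numpy as np

def aggregate(x, centroids, expert_preds, tau=1.0):
    """Combine per-region expert predictions for one test point.

    x:            (d,) test feature vector
    centroids:    (K, d) mean feature vector of each training region
    expert_preds: (K,) each region expert's prediction for x
    """
    # Weight experts by closeness of x to each region's feature centroid
    dists = np.linalg.norm(centroids - x, axis=1)
    w = np.exp(-dists / tau)
    w /= w.sum()                      # softmax over negative distances
    return float(w @ expert_preds)   # weighted aggregate prediction

# Toy usage: a test point near region 0's centroid leans on expert 0
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])
preds = np.array([10.0, 50.0])
print(aggregate(np.array([0.1, 0.0]), centroids, preds))
```

The key design point this sketch captures is that the weights are computed from test features alone, with no test labels, which is what lets the mixture adapt when the test distribution shifts toward rare target regions.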