🤖 AI Summary
This work addresses the challenge of automatically constructing efficient, compact, and high-performing multi-model ensembles for unsupervised anomaly detection without access to labeled data. The paper proposes MetaEns, a novel framework that achieves fully unsupervised ensemble model selection for the first time. MetaEns leverages meta-learning to predict the marginal gain of each candidate model with respect to the current ensemble, incorporating a submodular optimization objective enhanced by diversity-aware discounting and family-level risk regularization. This enables diversity-conscious greedy selection with adaptive early stopping. Extensive experiments across 39 real-world datasets demonstrate that MetaEns significantly outperforms existing unsupervised ensemble methods using fewer models, while improving average precision and effectively avoiding redundancy and performance saturation.
📝 Abstract
Unsupervised outlier detection is attractive because it eliminates the need for labeled data. Moreover, forming multi-model ensembles can improve detection robustness. However, composing an ensemble without labeled data is challenging. Naively composed ensembles can suffer from ensemble saturation, where redundant or unreliable detection models degrade performance and incur unnecessary computation. We propose MetaEns, an automatic unsupervised framework for selecting ensembles of outlier detection models. Using labeled meta-datasets, MetaEns learns a model that predicts marginal ensemble gains, estimating the expected improvement from adding a candidate model to a partially constructed ensemble. At test time, this learned signal is combined with a submodular-inspired proxy objective that enforces diminishing returns through diversity-aware discounting and family-level risk regularization, thereby enabling greedy sequential selection with adaptive early stopping. As a result, MetaEns constructs compact, high-quality ensembles without access to ground-truth labels. Experiments on 39 real-world datasets show that MetaEns consistently outperforms state-of-the-art unsupervised selectors and ensemble baselines, achieving higher average precision while using fewer models.