Objective-Induced Bias and Search Dynamics in Multiobjective Unsupervised Feature Selection

📅 2026-05-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study examines how objective function design, the direction of subset-size regularization, and the initialization strategy bias the search and change performance in multi-objective unsupervised feature selection. Using a synthetic dataset with known informative, redundant, and irrelevant features, the authors compare six formulations that pair one of three evaluation objectives (accuracy, silhouette coefficient, or PCA reconstruction loss) with either minimization or maximization of subset size. The choice of formulation strongly affects both the search dynamics and the quality of the resulting Pareto front. The PCA reconstruction loss objective yields compact feature subsets whose test accuracy is comparable to that of subsets obtained by directly optimizing supervised accuracy, whereas silhouette-based formulations are strongly biased toward trivial, low-cardinality solutions and remain weak proxies for predictive performance.
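As a reminder of what "Pareto front" means here, the following is a minimal sketch of a non-dominated filter over (evaluation score, subset size) pairs. The candidate values are hypothetical and not taken from the paper, and the helper names `dominates` and `pareto_front` are illustrative. Every objective is treated as a quantity to minimise, so subset-size maximisation is expressed by negating the size.

```python
def dominates(a, b):
    """Return True if a is no worse than b on every objective and
    strictly better on at least one (all objectives are minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (loss, subset_size) pairs, NOT results from the paper.
candidates = [(0.40, 2), (0.25, 4), (0.30, 4), (0.10, 9), (0.45, 3)]

# Formulation A: minimise the loss and minimise the subset size.
front_min_size = pareto_front(candidates)

# Formulation B: minimise the loss and maximise the subset size
# (the size is negated so that it also becomes a minimisation objective).
flipped = [(loss, -size) for loss, size in candidates]
front_max_size = [(loss, -neg) for loss, neg in pareto_front(flipped)]
```

With these made-up numbers, flipping the size direction changes which candidates survive the filter, which is the kind of formulation effect the summary describes.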
📝 Abstract
Unsupervised feature selection is commonly formulated as a multiobjective optimisation problem that jointly optimises subset quality and subset size. Yet the behaviour of this formulation depends critically on the choice of evaluation objective, the direction of subset-size regularisation, and the initialisation strategy. We study these factors in a controlled setting using a synthetic dataset with known informative, redundant, and irrelevant feature types. Six formulations are compared by combining three evaluation objectives (accuracy, silhouette score, and PCA reconstruction loss) with subset-size minimisation or maximisation. The results show that formulation strongly affects both search dynamics and the quality of the resulting Pareto front. Silhouette-based formulations exhibit a strong bias toward trivial low-cardinality solutions and remain weak proxies for predictive performance. In contrast, the proposed PCA loss objective produces compact subsets with test accuracy comparable to subsets obtained by directly optimising supervised accuracy. Overall, the study shows that objective design is central to effective multiobjective unsupervised feature selection.
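The abstract does not spell out how the PCA reconstruction loss is computed, so the sketch below is only one possible reading, not the authors' definition. It assumes that PCA is fitted on the full centred data, that unselected features are replaced by their mean, and that the loss is the mean squared error after projecting onto the leading principal axes and back. The function names, the `n_components` parameter, and the random data are all hypothetical.

```python
import numpy as np


def pca_reconstruction_loss(X, mask, n_components):
    """Assumed loss: MSE between the centred data and its reconstruction
    from the selected features through the top principal axes."""
    X = np.asarray(X, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    Xc = X - X.mean(axis=0)                    # centre each feature
    # Rows of Vt are the principal axes of the full centred data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                    # shape: (n_features, k)
    Xm = Xc * mask                             # unselected features -> mean (0)
    recon = Xm @ V @ V.T                       # project onto axes and back
    return float(np.mean((Xc - recon) ** 2))


def objectives(X, mask, n_components, size_direction="min"):
    """Return (loss, size term), both to be minimised.
    size_direction="max" negates the size, giving the 'maximise size' variant."""
    size = int(np.sum(mask))
    loss = pca_reconstruction_loss(X, mask, n_components)
    return loss, (size if size_direction == "min" else -size)


# Hypothetical data: 50 samples, 4 features (not the paper's dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))

full = np.ones(4, dtype=bool)
empty = np.zeros(4, dtype=bool)
loss_full = pca_reconstruction_loss(X, full, n_components=4)
loss_empty = pca_reconstruction_loss(X, empty, n_components=4)
obj_max = objectives(X, full, n_components=2, size_direction="max")
```

Under this assumed definition, selecting every feature and keeping every component reconstructs the data almost exactly, while an empty subset leaves the full variance as loss. A multiobjective search would then trade this loss against the size term.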
Problem

Research questions and friction points this paper is trying to address.

unsupervised feature selection
multiobjective optimisation
objective-induced bias
Pareto front
search dynamics
Innovation

Methods, ideas, or system contributions that make the work stand out.

multiobjective optimisation
unsupervised feature selection
PCA reconstruction loss
Pareto front
objective-induced bias