🤖 AI Summary
To address the inefficiency of multi-objective feature selection in high-dimensional data caused by feature redundancy and interdependence, this paper proposes Multi-Objective Differential Evolution for Feature Selection (MO-DEFS). MO-DEFS introduces a four-subpopulation initialization strategy that jointly incorporates feature weighting and redundancy assessment. It also uses a weight-guided mutation mechanism and an adaptive grid-based deduplication strategy to improve both solution-set diversity and convergence quality. On 11 UCI benchmark datasets, MO-DEFS achieves a superior Pareto front for the two conflicting objectives of minimizing feature count and minimizing classification error rate: the selected feature subsets are, on average, 12.7% more compact and yield 1.9% higher classification accuracy, while exhibiting significantly greater robustness than mainstream algorithms including NSGA-II and MOEA/D.
📝 Abstract
Multiobjective feature selection seeks to determine the most discriminative feature subset by simultaneously optimizing two conflicting objectives: minimizing the number of selected features and minimizing the classification error rate. The goal is to enhance the model's predictive performance and computational efficiency. However, feature redundancy and interdependence in high-dimensional data hinder both the search efficiency of optimization algorithms and the quality of the resulting solutions. To tackle these issues, we propose a high-dimensional feature selection algorithm based on multiobjective differential evolution. First, a population initialization strategy is designed by integrating feature weights and redundancy indices, where the population is divided into four subpopulations to improve the diversity and uniformity of the initial population. Then, a multiobjective selection mechanism is developed, in which feature weights guide the mutation process. The solution quality is further enhanced through nondominated sorting, with preference given to solutions with lower classification error, balancing global exploration and local exploitation. Finally, an adaptive grid mechanism is applied in the objective space to identify densely populated regions and detect duplicated solutions. Experimental results on 11 UCI datasets of varying difficulty demonstrate that the proposed method significantly outperforms several state-of-the-art multiobjective feature selection approaches.
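To make the two-objective setting concrete, the following is a minimal Python sketch of a nondominated filter and a grid-based density count over the objective space. It is an illustration, not the authors' implementation: the function names, the `divisions` parameter, and the tie ordering (lower classification error first) are assumptions chosen to mirror the description above. Each solution is represented only by its objective pair (number of selected features, classification error).

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (number of selected features, classification error)
Objective = Tuple[int, float]


def dominates(a: Objective, b: Objective) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def nondominated_front(points: List[Objective]) -> List[Objective]:
    """Return the nondominated points, with exact duplicates removed.

    The result is ordered by classification error first (an assumed way of
    expressing the preference for lower-error solutions), then feature count.
    """
    unique = sorted(set(points), key=lambda p: (p[1], p[0]))
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


def grid_counts(points: List[Objective], divisions: int = 4) -> Dict[Tuple[int, int], int]:
    """Count solutions per grid cell in objective space.

    The grid bounds adapt to the current population: each objective's range
    [min, max] is split into `divisions` equal intervals. Cells with high
    counts mark densely populated regions; identical objective pairs always
    fall into the same cell.
    """
    mins = [min(p[k] for p in points) for k in range(2)]
    maxs = [max(p[k] for p in points) for k in range(2)]
    counts: Dict[Tuple[int, int], int] = {}
    for p in points:
        cell = []
        for k in range(2):
            span = maxs[k] - mins[k]
            if span == 0:
                idx = 0  # all points share this objective value
            else:
                # Clamp so the maximum value lands in the last cell.
                idx = min(int((p[k] - mins[k]) / span * divisions), divisions - 1)
            cell.append(idx)
        key = (cell[0], cell[1])
        counts[key] = counts.get(key, 0) + 1
    return counts
```

In a full algorithm, the grid counts would feed a pruning step that thins crowded cells. That step is omitted here because the abstract does not specify it.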