🤖 AI Summary
To address low accuracy and poor cross-dataset generalizability in code smell detection for large-scale software systems, this paper proposes a machine learning framework that integrates data balancing, feature selection, and hyperparameter optimization. Specifically, it employs SMOTE for minority-class oversampling, Pearson correlation-based feature filtering, and three hyperparameter tuning strategies (grid search, random search, and Bayesian optimization) to systematically evaluate eight mainstream classifiers, including XGBoost, AdaBoost, and Random Forest. Experimental results show that AdaBoost achieves 100% accuracy while XGBoost and Random Forest reach 99%; all three significantly outperform baseline methods on precision, recall, F1-score, and AUC. The paper presents this as the first systematic validation of combining these optimization strategies for code smell detection, offering a paradigm for accurate, generalizable, and reproducible automated software quality analysis.
📝 Abstract
This study addresses the challenge of detecting code smells in large-scale software systems using machine learning (ML). Traditional detection methods often suffer from low accuracy and poor generalization across different datasets. To overcome these issues, we propose a machine learning-based model that automatically and accurately identifies code smells, offering a scalable solution for software quality analysis. The novelty of our approach lies in the use of eight diverse ML algorithms, including XGBoost, AdaBoost, and other classifiers, alongside key techniques such as the Synthetic Minority Over-sampling Technique (SMOTE) for class imbalance and Pearson correlation for efficient feature selection. These methods collectively improve model accuracy and generalization. Our methodology involves several steps: first, we preprocess the data and apply SMOTE to balance the dataset; next, we apply Pearson correlation-based feature selection to reduce redundancy; then we train the eight ML algorithms and tune their hyperparameters through Grid Search, Random Search, and Bayesian Optimization. Finally, we evaluate the models using accuracy, F-measure, and confusion matrices. The results show that AdaBoost, Random Forest, and XGBoost perform best, achieving accuracies of 100%, 99%, and 99%, respectively. This study provides a robust framework for detecting code smells, enhancing software quality assurance, and demonstrating the effectiveness of a comprehensive, optimized ML approach.
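The train-tune-evaluate loop described in the abstract can be sketched with scikit-learn. This is an illustrative reconstruction under stated assumptions, not the authors' code: it shows only two of the eight classifiers, uses a synthetic stand-in for the balanced code-metrics dataset, uses small made-up hyperparameter grids, and omits Bayesian Optimization (which would require an extra library such as Optuna or scikit-optimize).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
from sklearn.model_selection import (GridSearchCV, RandomizedSearchCV,
                                     train_test_split)

# Stand-in for the balanced, feature-selected code-metrics dataset.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Grid Search: exhaustively try every combination in a small grid.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [100, 200], "max_depth": [None, 10]},
    cv=5, scoring="f1",
)
grid.fit(X_tr, y_tr)

# Random Search: sample a fixed number of combinations instead.
rand = RandomizedSearchCV(
    AdaBoostClassifier(random_state=0),
    param_distributions={"n_estimators": [50, 100, 200],
                         "learning_rate": [0.5, 1.0]},
    n_iter=4, cv=5, scoring="f1", random_state=0,
)
rand.fit(X_tr, y_tr)

# Evaluate the tuned models with the paper's metrics:
# accuracy, F-measure, and a confusion matrix.
for name, model in [("RandomForest", grid.best_estimator_),
                    ("AdaBoost", rand.best_estimator_)]:
    pred = model.predict(X_te)
    print(name, accuracy_score(y_te, pred), f1_score(y_te, pred))
    print(confusion_matrix(y_te, pred))
```

The remaining six classifiers would slot into the same loop; only the estimator and its parameter grid change, which is what makes the eight-way comparison in the paper systematic.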