🤖 AI Summary
To address two key bottlenecks of standard Support Vector Regression (SVR), namely the high computational cost of solving a quadratic programming problem on large-scale regression tasks and the sensitivity of the ε-insensitive loss to outliers, this paper introduces Granular Ball Support Vector Regression (GBSVR), an SVR variant that brings granular balls, so far little explored for regression, into regression modeling. Methodologically: (i) it performs coarse-grained data aggregation via continuous attribute discretization and granular ball partitioning, representing each ball by its center and radius to drastically reduce the number of optimization variables; (ii) it designs a tailored SVR objective and a robust loss function variant compatible with the granular ball representation. Experiments on multiple benchmark datasets show that GBSVR outperforms state-of-the-art methods, achieving higher prediction accuracy while reducing average training time by one to two orders of magnitude.
📝 Abstract
Support Vector Regression (SVR) and its variants are widely used for regression tasks. However, their solution requires solving an expensive quadratic programming problem, which limits their application, especially on large datasets. Additionally, SVR uses an epsilon-insensitive loss function, which is sensitive to outliers and can therefore adversely affect its performance. We propose Granular Ball Support Vector Regression (GBSVR), which tackles the regression problem using the granular ball concept. Granular balls are useful for simplifying complex data spaces in machine learning tasks; however, to the best of our knowledge, they have not been sufficiently explored for regression problems. They group data points into balls based on proximity and reduce the computational cost of SVR by replacing the large number of data points with far fewer granular balls. This work also proposes a discretization method for continuous-valued attributes to facilitate the construction of granular balls. The proposed approach is evaluated on several benchmark datasets, where it outperforms existing state-of-the-art approaches.
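
To make the idea of replacing many data points with far fewer granular balls concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a simple size-based stopping rule (`max_size`), a two-seed split, and a ball target equal to the mean of its members' targets; all function names and parameters are hypothetical, and the paper's discretization step and SVR formulation are not reproduced here.

```python
import math
import random


def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def make_ball(points, targets):
    """Summarize a group as center, radius (mean distance to center), mean target."""
    c = centroid(points)
    radius = sum(math.dist(p, c) for p in points) / len(points)
    return {
        "center": c,
        "radius": radius,
        "target": sum(targets) / len(targets),  # assumption: mean target per ball
        "size": len(points),
    }


def split(group):
    """Split a group in two by assigning each point to the nearer of two seeds."""
    points = [p for p, _ in group]
    c = centroid(points)
    seed_a = max(points, key=lambda p: math.dist(p, c))       # farthest from center
    seed_b = max(points, key=lambda p: math.dist(p, seed_a))  # farthest from seed_a
    left, right = [], []
    for p, t in group:
        if math.dist(p, seed_a) <= math.dist(p, seed_b):
            left.append((p, t))
        else:
            right.append((p, t))
    return left, right


def granular_balls(points, targets, max_size=8):
    """Recursively split the data until every group has at most max_size points."""
    stack = [list(zip(points, targets))]
    balls = []
    while stack:
        group = stack.pop()
        left, right = split(group) if len(group) > max_size else (group, [])
        if not left or not right:
            # Small enough, or all points identical: emit one ball for the group.
            balls.append(make_ball([p for p, _ in group], [t for _, t in group]))
        else:
            stack.extend([left, right])
    return balls


# Toy data: 200 one-dimensional inputs with a noisy sine target.
random.seed(0)
X = [[random.uniform(0.0, 10.0)] for _ in range(200)]
y = [math.sin(x[0]) + random.gauss(0.0, 0.1) for x in X]

balls = granular_balls(X, y, max_size=8)
print(f"{len(X)} points summarized by {len(balls)} granular balls")
```

A downstream regressor would then be trained on the ball centers (and, in a granular-ball-aware formulation, their radii) instead of on all points, which is where the reduction in optimization variables would come from.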