Ranked Set Sampling-Based Multilayer Perceptron: Improving Generalization via Variance-Based Bounds

📅 2025-07-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Multilayer perceptrons (MLPs) suffer from limited generalization due to high variance in empirical loss estimates. Method: This paper introduces ranked set sampling (RSS)—a novel data resampling strategy for neural network training—to construct ordered data structures that reduce empirical loss variance. We theoretically establish that RSS yields lower variance than simple random sampling and enhances both diversity and consistency among base learners. Furthermore, we propose two variance-aware fusion strategies—integrating empirical exponential loss and logistic loss—to improve loss estimation. Results: Extensive experiments across 12 benchmark datasets demonstrate that RSS-MLP consistently reduces empirical loss variance and achieves stable generalization improvements under both loss functions, empirically validating the critical role of variance control in enhancing MLP generalization.

📝 Abstract
Multilayer perceptron (MLP), one of the most fundamental neural networks, is extensively utilized for classification and regression tasks. In this paper, we establish a new generalization error bound, which reveals how the variance of the empirical loss influences the generalization ability of the learning model. Inspired by this learning bound, we advocate reducing the variance of the empirical loss to enhance the generalization ability of the MLP. As is well known, bagging is a popular ensemble method for variance reduction. However, bagging produces the base training data sets by Simple Random Sampling (SRS), which exhibits a high degree of randomness. To handle this issue, we introduce an ordered structure into the training data set by Ranked Set Sampling (RSS) to further reduce the variance of the loss, and develop an RSS-MLP method. Theoretical results show that the variances of the empirical exponential loss and the logistic loss estimated by RSS are smaller than those estimated by SRS. To validate the performance of RSS-MLP, we conduct comparison experiments on twelve benchmark data sets in terms of the two convex loss functions under two fusion methods. Extensive experimental results and analysis illustrate the effectiveness and rationality of the proposed method.
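The variance-reduction effect of RSS over SRS described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the authors' code: the exponential population, set size, and cycle count are illustrative assumptions, and ranking is done on the values themselves rather than on a concomitant variable as in practical RSS.

```python
import numpy as np

def ranked_set_sample(data, set_size, n_cycles, rng):
    """Draw a ranked set sample: in each cycle, for each rank r,
    draw `set_size` units at random, sort them, and keep only the
    r-th order statistic. Yields set_size * n_cycles observations."""
    sample = []
    for _ in range(n_cycles):
        for r in range(set_size):
            candidates = rng.choice(data, size=set_size, replace=False)
            sample.append(np.sort(candidates)[r])
    return np.array(sample)

rng = np.random.default_rng(0)
# Illustrative skewed population (exponential, mean 1.0)
population = rng.exponential(scale=1.0, size=100_000)

set_size, n_cycles, n_trials = 5, 20, 2000
n = set_size * n_cycles  # 100 observations per sample either way

# Compare the variance of the sample mean under SRS vs. RSS
srs_means = [rng.choice(population, size=n, replace=False).mean()
             for _ in range(n_trials)]
rss_means = [ranked_set_sample(population, set_size, n_cycles, rng).mean()
             for _ in range(n_trials)]

print(f"SRS mean-estimator variance: {np.var(srs_means):.6f}")
print(f"RSS mean-estimator variance: {np.var(rss_means):.6f}")
```

Across trials, the RSS estimator's variance comes out noticeably smaller than the SRS estimator's while both remain unbiased; the paper applies the same mechanism to the empirical exponential and logistic losses used to train the MLP.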
Problem

Research questions and friction points this paper is trying to address.

Improving MLP generalization via variance-based error bounds
Reducing empirical loss variance with Ranked Set Sampling
Comparing RSS-MLP performance on benchmark datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Ranked Set Sampling for variance reduction
Introduces ordered structure in training data
Develops RSS-MLP method for better generalization
Feijiang Li
Institute of Big Data Science and Industry, Shanxi University, Taiyuan, China; Key Laboratory of Evolutionary Science Intelligence of Shanxi Province, Taiyuan, China.
Liuya Zhang
Institute of Big Data Science and Industry, Shanxi University, Taiyuan, China.
Jieting Wang
Institute of Big Data Science and Industry, Shanxi University, Taiyuan, China; Key Laboratory of Evolutionary Science Intelligence of Shanxi Province, Taiyuan, China.
Tao Yan
Institute of Big Data Science and Industry, Shanxi University, Taiyuan, China; Key Laboratory of Evolutionary Science Intelligence of Shanxi Province, Taiyuan, China.
Yuhua Qian
Institute of Big Data Science and Industry, Shanxi University, Taiyuan, China.
Research interests: machine learning, data mining, complex networks.