🤖 AI Summary
Multilayer perceptrons (MLPs) suffer from limited generalization due to high variance in empirical loss estimates. Method: The paper applies ranked set sampling (RSS), a classical sampling scheme repurposed as a data resampling strategy for neural network training, to construct ordered training subsets that reduce the variance of the empirical loss. It establishes theoretically that RSS yields lower-variance loss estimates than simple random sampling, for both the empirical exponential loss and the logistic loss, and that it enhances diversity and consistency among base learners; the resulting RSS-MLP is paired with two fusion strategies for combining those learners. Results: Experiments on twelve benchmark datasets show that RSS-MLP consistently reduces empirical loss variance and achieves stable generalization improvements under both loss functions and both fusion methods, empirically supporting the role of variance control in enhancing MLP generalization.
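The variance claim at the heart of the method can be checked on a toy estimand. The sketch below is an illustration, not the paper's code: it compares the Monte Carlo variance of a population mean estimated from simple random samples against ranked set samples of the same total size, assuming perfect ranking within each set (the set size, cycle count, and skewed "loss" population are all illustrative choices).

```python
import numpy as np

rng = np.random.default_rng(0)
pop = rng.exponential(size=100_000)   # skewed stand-in for per-example losses
k, m, reps = 5, 20, 5_000             # set size, cycles, Monte Carlo repetitions
n = k * m                             # total sample size per estimate

srs_means, rss_means = [], []
for _ in range(reps):
    # Simple random sampling: draw n units directly and average them.
    srs_means.append(pop[rng.choice(pop.size, n, replace=False)].mean())

    # Ranked set sampling: for each rank r, draw a fresh set of k units,
    # order it, and keep only the r-th order statistic (perfect ranking).
    kept = []
    for _ in range(m):
        for r in range(k):
            s = np.sort(pop[rng.choice(pop.size, k, replace=False)])
            kept.append(s[r])
    rss_means.append(np.mean(kept))

print("SRS estimator variance:", np.var(srs_means))
print("RSS estimator variance:", np.var(rss_means))  # noticeably smaller
```

Both estimators are unbiased for the population mean; under perfect within-set ranking, the RSS estimator's variance is strictly smaller, which is the property the paper extends to exponential and logistic loss estimates.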
📝 Abstract
Multilayer perceptron (MLP), one of the most fundamental neural networks, is extensively utilized for classification and regression tasks. In this paper, we establish a new generalization error bound, which reveals how the variance of the empirical loss influences the generalization ability of the learning model. Inspired by this bound, we advocate reducing the variance of the empirical loss to enhance the generalization ability of MLP. As is well known, bagging is a popular ensemble method for variance reduction. However, bagging produces the base training sets by Simple Random Sampling (SRS), which exhibits a high degree of randomness. To address this issue, we introduce an ordered structure into the training data via Ranked Set Sampling (RSS) to further reduce the variance of the loss, and develop an RSS-MLP method. Theoretical results show that the variances of the empirical exponential loss and the logistic loss estimated by RSS are smaller than those estimated by SRS. To validate the performance of RSS-MLP, we conduct comparison experiments on twelve benchmark data sets with the two convex loss functions under two fusion methods. Extensive experimental results and analysis illustrate the effectiveness and rationality of the proposed method.
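To make the pipeline concrete, here is a minimal sketch of how RSS-drawn subsets could replace bootstrap resamples in a bagging-style MLP ensemble. Everything beyond the RSS draw itself is a labeled assumption: the concomitant ranking score, the set size and cycle count, and the probability-averaging fusion are illustrative stand-ins, not the paper's two fusion methods or loss-specific details.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def rss_indices(scores, set_size, n_cycles, rng):
    """Ranked-set-sample indices: for each rank r, draw a random set of
    `set_size` indices, order it by `scores`, and keep the r-th one."""
    kept = []
    for _ in range(n_cycles):
        for r in range(set_size):
            idx = rng.choice(len(scores), set_size, replace=False)
            kept.append(idx[np.argsort(scores[idx])][r])
    return np.asarray(kept)

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = (X[:, 0] + 0.5 * rng.normal(size=2000) > 0).astype(int)
scores = X[:, 0]  # hypothetical concomitant variable used for ranking

# Train an ensemble of base MLPs, each on its own RSS-drawn subset,
# instead of the bootstrap resamples that standard bagging would use.
ensemble = []
for seed in range(5):
    idx = rss_indices(scores, set_size=5, n_cycles=80, rng=rng)
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=seed)
    ensemble.append(clf.fit(X[idx], y[idx]))

# Fuse by averaging predicted class probabilities (one simple fusion choice).
proba = np.mean([clf.predict_proba(X) for clf in ensemble], axis=0)
print("train accuracy:", (proba.argmax(axis=1) == y).mean())
```

The ordered structure enters only through `rss_indices`: each retained example is a specific order statistic of a small random set, which is what lowers the variance of the per-subset empirical loss relative to plain SRS resampling.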