Block-regularized 5×2 Cross-validated McNemar's Test for Comparing Two Classification Algorithms

📅 2023-04-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional McNemar’s test suffers from high variance in error-rate estimation and low statistical power due to reliance on a single data split. To address this, we propose a novel nonparametric method integrating block-regularized 5×2 cross-validation with McNemar’s test. Our approach repeatedly performs 5×2 cross-validation to generate ten dependent contingency tables, then applies a block-regularization strategy to aggregate them into a single effective test table. This substantially reduces Type I error inflation while improving statistical power. Extensive experiments across multiple synthetic and real-world datasets demonstrate that the proposed method maintains strict Type I error control (≈0.05) and achieves, on average, over 35% higher statistical power than conventional McNemar’s test. The framework thus provides a more robust and reliable basis for comparing classification algorithms.
📝 Abstract
When comparing two classification algorithms, the widely used McNemar's test is applied to infer whether the error rates of the two algorithms differ significantly. However, the power of the conventional McNemar's test is usually low, because the hold-out (HO) method in the test uses only a single train-validation split, which usually produces a high-variance estimate of the error rates. In contrast, a cross-validation (CV) method repeats the HO method multiple times and produces a more stable estimate. A CV method can therefore improve the power of McNemar's test. Among CV methods, block-regularized 5×2 CV (BCV) has been shown in many previous studies to be superior to other CV methods for comparing algorithms, because 5×2 BCV produces a high-quality estimator of the error rate by regularizing the numbers of overlapping records between all training sets. In this study, we compress the 10 correlated contingency tables in 5×2 BCV to form an effective contingency table. We then define a 5×2 BCV McNemar's test on the basis of the effective contingency table. We demonstrate the reasonable type I error and the promising power of the proposed 5×2 BCV McNemar's test on multiple simulated and real-world data sets.
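To make the ingredients named in the abstract concrete, here is a minimal Python sketch of the conventional, continuity-corrected McNemar's test on one contingency table, together with a plain random 5×2 CV splitter that yields the ten train-validation splits from which ten contingency tables would be built. This is an illustration under stated assumptions, not the authors' method: the splitter is not block-regularized, the step that compresses the ten tables into an effective table is not shown because the abstract does not specify it, and all function names are hypothetical.

```python
import math
import random


def contingency_counts(y_true, pred_a, pred_b):
    """Count (n00, n01, n10, n11) for two classifiers on one validation set.

    n00: both wrong, n01: only A wrong, n10: only B wrong, n11: both right.
    """
    n00 = n01 = n10 = n11 = 0
    for y, a, b in zip(y_true, pred_a, pred_b):
        a_ok, b_ok = (a == y), (b == y)
        if a_ok and b_ok:
            n11 += 1
        elif a_ok:
            n10 += 1  # A right, B wrong
        elif b_ok:
            n01 += 1  # A wrong, B right
        else:
            n00 += 1
    return n00, n01, n10, n11


def mcnemar_test(n01, n10):
    """Continuity-corrected McNemar statistic and p-value (chi-square, 1 df)."""
    n = n01 + n10
    if n == 0:
        return 0.0, 1.0  # no discordant records: no evidence of a difference
    stat = max(abs(n01 - n10) - 1, 0) ** 2 / n
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def five_by_two_splits(n_records, seed=0):
    """Plain random 5x2 CV: 5 random halvings, each half used once for training.

    Assumption: ordinary random halving, NOT the block-regularized scheme,
    which additionally balances the overlap between training sets.
    """
    rng = random.Random(seed)
    splits = []
    for _ in range(5):
        idx = list(range(n_records))
        rng.shuffle(idx)
        half = n_records // 2
        first, second = idx[:half], idx[half:]
        splits.append((first, second))  # train on first, validate on second
        splits.append((second, first))  # swap the roles
    return splits
```

Calling `contingency_counts` on each of the ten validation sets gives the ten correlated tables the abstract refers to; the conventional test corresponds to calling `mcnemar_test` on the discordant counts of a single split.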
Problem

Research questions and friction points this paper is trying to address.

Classification Methods
Performance Evaluation
Cross-validation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Block Regularization Cross-Validation
Enhanced McNemar Test
Stability and Accuracy Improvement
Jing Yang
School of Automation and Software Engineering, Shanxi University, Taiyuan, China, 030031
Ruibo Wang
King Abdullah University of Science and Technology (KAUST)
Stochastic Geometry, Wireless Communications
Yijun Song
School of Modern Educational Technology, Shanxi University, Taiyuan, China, 030006
Jihong Li
Shanghai University
Wireless Communications