MARS: Magnitude-Aware Rank Statistics

📅 2026-05-22

🤖 AI Summary

This study addresses a limitation of traditional Critical Difference (CD) diagrams: they rely solely on discrete ranks and ignore the magnitude of performance differences among models, which can distort comparative evaluations. To overcome this “magnitude-blind” deficiency, the authors propose MARS, a method that integrates the scale of performance gaps into non-parametric multiple-comparison statistics. MARS weights and rescales model ranks through a relative margin coefficient, combining non-parametric rank analysis, relative-margin weighting, and a dynamic projection for boundary cases. The authors report that the resulting diagrams reflect performance differences across models more accurately and intuitively than standard CD diagrams.
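To make magnitude-blindness concrete, here is a minimal sketch in plain Python. The score tables `narrow` and `wide` and the helpers `rank_row` and `average_ranks` are made up for illustration and do not come from the paper. Two sets of results with very different performance gaps produce identical average ranks, so a rank-only CD diagram would draw them the same way.

```python
def rank_row(scores):
    """Rank one dataset's scores: 1 = best (highest score); ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next model has the same score (a tie).
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based positions in the tie group
        for j in range(pos, end + 1):
            ranks[order[j]] = avg
        pos = end + 1
    return ranks


def average_ranks(table):
    """Average each model's rank over all datasets (rows = datasets, columns = models)."""
    per_dataset = [rank_row(row) for row in table]
    return [sum(col) / len(per_dataset) for col in zip(*per_dataset)]


# Illustrative accuracies for three models on two datasets (invented numbers).
narrow = [[0.901, 0.900, 0.899], [0.851, 0.850, 0.849]]  # gaps of 0.001
wide = [[0.95, 0.70, 0.40], [0.90, 0.60, 0.30]]          # gaps of 0.25 or more

print(average_ranks(narrow))
print(average_ranks(wide))
```

Both calls return the same average ranks even though the gaps differ by more than two orders of magnitude, which is the information a rank-only comparison discards.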

📝 Abstract

Comprehensive evaluation of machine learning models is key to ensuring that they perform as robustly and consistently as desired. Critical Difference (CD) diagrams are commonly used to summarize experimental results and pick a winner. Standard CD diagrams rely on discrete ranks and discard the magnitude of performance gaps between models, an issue we call magnitude-blindness. To address it, we propose Magnitude-Aware Rank Statistics (MARS), which incorporates a relative margin coefficient as a weight for the discrete ranks. This coefficient scales ranks based on the distance between the best and worst performers, with a dynamic projection to handle boundary cases. Once a CD value is calculated, MARS yields a more realistic statistical representation of differences in model performance and more insight into how methods actually perform in extensive experimental settings.
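The abstract does not give the formula for the relative margin coefficient or say which CD computation MARS uses, so the sketch below rests on two stated assumptions. First, `nemenyi_cd` computes the standard Nemenyi critical difference, CD = q_alpha * sqrt(k(k+1) / (6N)), with the commonly tabulated critical values for alpha = 0.05. Second, `margin_scaled_positions` is a hypothetical stand-in, not the authors' method: it places each model on the rank axis [1, k] in proportion to its distance from the best performer, and falls back to the middle rank when all models tie, as a crude placeholder for the dynamic projection the abstract mentions.

```python
import math

# Two-tailed Nemenyi critical values q_alpha at alpha = 0.05, indexed by the
# number of models k (values as commonly tabulated for CD diagrams).
Q_ALPHA_005 = {2: 1.960, 3: 2.343, 4: 2.569, 5: 2.728, 6: 2.850,
               7: 2.949, 8: 3.031, 9: 3.102, 10: 3.164}


def nemenyi_cd(k, n_datasets):
    """Standard Nemenyi critical difference for k models over n_datasets datasets."""
    return Q_ALPHA_005[k] * math.sqrt(k * (k + 1) / (6.0 * n_datasets))


def margin_scaled_positions(scores):
    """HYPOTHETICAL illustration, not the paper's formula.

    Maps each score to a position in [1, k]: the best model gets 1, the worst
    gets k, and the others are spaced by their relative margin to the best.
    """
    k = len(scores)
    best, worst = max(scores), min(scores)
    if best == worst:
        # Degenerate boundary case: no spread, so every model gets the middle rank.
        return [(k + 1) / 2.0] * k
    return [1.0 + (k - 1) * (best - s) / (best - worst) for s in scores]


print(nemenyi_cd(k=3, n_datasets=10))
print(margin_scaled_positions([0.95, 0.70, 0.40]))
print(margin_scaled_positions([0.90, 0.89, 0.40]))
```

Under this assumed scaling, the second and third calls show the intended effect: a model that nearly matches the best performer lands close to position 1 instead of a full rank away, while the CD value is computed exactly as in a standard diagram.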

Problem

Research questions and friction points this paper is trying to address.

Critical Difference diagrams

magnitude-blindness

model evaluation

performance gaps

rank statistics

Innovation

Methods, ideas, or system contributions that make the work stand out.

Magnitude-Aware Rank Statistics

Critical Difference diagrams

relative margin coefficient