🤖 AI Summary
This study investigates the predictive validity of ICPC contest tiers—regional contests, super-regional contests, and the World Finals—as well as Codeforces ratings, for performance in the ICPC World Finals. Leveraging empirical data from 2015–2024, we apply Kendall’s τ rank correlation, multi-level data fusion, and cross-platform performance comparison to systematically quantify inter-tier performance consistency—the first such large-scale, longitudinal analysis. Results show moderate predictive power of super-regional contests (τ = 0.407) and stronger predictive power of Codeforces ratings (τ = 0.596) for World Finals outcomes. Notably, the Northern Eurasia regional contest exhibits the highest consistency with super-regional performance (τ = 0.666), underscoring its superior discriminative capacity and stability. We propose an evidence-driven paradigm for competitive programming contest design optimization and publicly release all datasets and analysis scripts to support reproducible research in contest evaluation and team selection.
📝 Abstract
Competitive programming (CP) contests are often treated as interchangeable proxies for algorithmic skill, yet the extent to which results at lower contest tiers anticipate performance at higher tiers, and how closely any tier resembles the ubiquitous online-contest circuit, remains unclear. We analyze ten years (2015--2024) of International Collegiate Programming Contest (ICPC) standings, comprising five long-running superregional championships (Africa & Arab, Asia East, Asia West, North America, and Northern Eurasia), the associated local regionals of North America and Northern Eurasia, and the World Finals. For 366 World Finalist teams (2021--2024) we augment the dataset with pre-contest Codeforces ratings. Pairwise rank alignment is measured with Kendall's $\tau$. Overall, superregional ranks predict World Finals ranks only moderately (weighted $\tau=0.407$), but regional-to-superregional consistency varies widely: Northern Eurasia exhibits the strongest alignment ($\tau=0.521$) while Asia West exhibits the weakest ($\tau=0.188$). Internal consistency within a region can exceed its predictive value for Worlds -- e.g., Northern Eurasia and North America regionals vs. superregionals ($\tau=0.666$ and $\tau=0.577$, respectively). Codeforces ratings correlate more strongly with World Finals results ($\tau=0.596$) than any single ICPC tier, suggesting that high-frequency online contests capture decisive skill factors that many superregional problem sets miss. We argue that contest organizers can improve both fairness and pedagogical value by aligning problem style and selection rules with the formats that demonstrably differentiate teams, in particular the Northern-Eurasian model and well-curated online rounds. All data, scripts, and additional analyses are publicly released to facilitate replication and further study.
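To make the central measurement concrete: Kendall's $\tau$ compares every pair of teams and asks whether the two rankings order that pair the same way (concordant) or oppositely (discordant). A minimal sketch of the pairwise comparison used throughout the paper, with purely illustrative ranks (the team count and rank values here are made up, not taken from the study's data):

```python
from scipy.stats import kendalltau

# Illustrative example: finishing ranks of the same six teams at two
# adjacent ICPC tiers (1 = best). Values are hypothetical.
superregional_ranks = [1, 2, 3, 4, 5, 6]
world_finals_ranks = [2, 1, 4, 3, 6, 5]

# tau = (concordant pairs - discordant pairs) / total pairs
# (scipy's default tau-b reduces to this when there are no ties)
tau, p_value = kendalltau(superregional_ranks, world_finals_ranks)
print(f"Kendall's tau = {tau:.3f}")  # 3 of 15 pairs are swapped -> tau = 0.6
```

Values near 1 indicate that the lower tier almost fully determines the ordering at the higher tier; the paper's observed $\tau=0.407$ for superregionals vs. the World Finals corresponds to a noticeably weaker, though still positive, alignment.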