🤖 AI Summary
Conventional rating systems such as Elo rely on manually specified prior parameters, lacking empirical, data-driven justification.
Method: This paper proposes a data-driven, end-to-end method for automatic calibration of core Elo parameters using real head-to-head match outcomes. We formulate parameter learning as a maximum-likelihood optimization problem targeting win-probability prediction accuracy, solved via gradient-based optimization coupled with Monte Carlo simulation.
Contribution/Results: To our knowledge, this is the first approach enabling fully data-driven, end-to-end Elo parameter estimation. The framework is generalizable to multi-player settings and extensible to other rating systems (e.g., TrueSkill). Evaluated on multiple real-world esports and board-game datasets, our method achieves an average 8.3% improvement in win-probability prediction accuracy over empirically tuned baselines, demonstrating both the efficacy and generalizability of data-driven parameter optimization.
📝 Abstract
This study aims to provide a data-driven approach for empirically tuning and validating rating systems, focusing on the Elo system. Well-known rating frameworks, such as Elo, Glicko, TrueSkill systems, rely on parameters that are usually chosen based on probabilistic assumptions or conventions, and do not utilize game-specific data. To address this issue, we propose a methodology that learns optimal parameter values by maximizing the predictive accuracy of match outcomes. The proposed parameter-tuning framework is a generalizable method that can be extended to any rating system, even for multiplayer setups, through suitable modification of the parameter space. Implementation of the rating system on real and simulated gameplay data demonstrates the suitability of the data-driven rating system in modeling player performance.