🤖 AI Summary
This study addresses the challenge of highly sparse lineup data in basketball, caused by frequent substitutions, which renders traditional statistical metrics noisy and weakly predictive. To overcome this limitation, the paper proposes L-RAPM, a regularized regression-based lineup evaluation method that introduces Bayesian priors on player performance and incorporates an opponent strength adjustment to construct a lineup-level net efficiency framework. By leveraging prior information and contextual adjustments, L-RAPM mitigates data sparsity in small-sample scenarios. Empirical results show that the proposed method outperforms the currently used baseline in predictive performance, with its advantage becoming more pronounced as sample sizes decrease.
📝 Abstract
Identifying combinations of players (that is, lineups) in basketball, and in other sports, that perform well when they play together is one of the most important tasks in sports analytics. One of the main challenges associated with this task is the frequent substitutions that occur during a game, which result in highly sparse data. In particular, a National Basketball Association (NBA) team will use more than 600 lineups during a season, which means that an average lineup sees the court for only approximately 25-30 possessions. Inevitably, any statistics collected for these lineups are going to be noisy, with low predictive value. Yet there is no existing work (publicly available, at least) that addresses this problem. In this work, we propose L-RAPM, a regression-based approach that controls for the opposition faced by each lineup while also utilizing information about the players making up the lineups. Our experiments show that L-RAPM provides better predictive power than the currently used baseline, and that this improvement increases as the sample size for the lineups gets smaller.
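To make the idea of a regression that adjusts for the opposition and borrows information from the players more concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a weighted ridge regression in which each stint contributes +1 for the home lineup and -1 for the away lineup, and in which lineup coefficients are shrunk toward a prior mean rather than toward zero. The function name `fit_lineup_ridge`, the penalty `lam`, the stint data, and the prior values (which could, for example, be sums of player-level ratings) are all hypothetical.

```python
import numpy as np

def fit_lineup_ridge(home_idx, away_idx, net_rating, possessions, prior_mean, lam):
    """Weighted ridge regression for lineup ratings, shrunk toward a prior mean.

    Illustrative assumption, not the paper's exact model. Minimizes
        sum_s w_s * (y_s - (beta[home_s] - beta[away_s]))**2
        + lam * ||beta - prior_mean||**2
    where s indexes stints, w_s is the number of possessions in the stint,
    and y_s is the observed net rating of the home lineup in that stint.
    """
    mu = np.asarray(prior_mean, dtype=float)
    y = np.asarray(net_rating, dtype=float)
    w = np.asarray(possessions, dtype=float)
    n_stints, n_lineups = len(y), len(mu)

    # Design matrix: +1 for the home lineup, -1 for the opposing lineup,
    # so each lineup's coefficient is adjusted for the opposition it faced.
    X = np.zeros((n_stints, n_lineups))
    rows = np.arange(n_stints)
    X[rows, home_idx] = 1.0
    X[rows, away_idx] = -1.0

    # Closed-form solution: beta = mu + (X'WX + lam*I)^{-1} X'W (y - X mu).
    A = X.T @ (w[:, None] * X) + lam * np.eye(n_lineups)
    b = X.T @ (w * (y - X @ mu))
    return mu + np.linalg.solve(A, b)

# Hypothetical toy data: 4 lineups, 3 stints; lineup 2 never appears.
home_idx = np.array([0, 0, 1])
away_idx = np.array([1, 3, 3])
net_rating = np.array([10.0, 5.0, -2.0])     # net points per 100 possessions
possessions = np.array([30.0, 20.0, 25.0])
prior_mean = np.array([2.0, 0.0, 1.0, -1.0])  # e.g. from player-level ratings

ratings = fit_lineup_ridge(home_idx, away_idx, net_rating,
                           possessions, prior_mean, lam=50.0)
```

In this sketch, a lineup with no observed possessions keeps its prior value, and lineups with few possessions stay close to it. That is the qualitative behavior one would want when the typical lineup has only a few dozen possessions of data.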