AI Summary
This work addresses the challenge that traditional implicit feedback ranking models struggle to disentangle an item's intrinsic utility from nonlinear contextual effects, the latter of which often dominate observed outcomes. The authors propose a semiparametric ranking framework that models log-scores as the sum of an interpretable utility parameter and a flexible nonparametric covariate effect approximated by a neural network, with both components jointly learned via maximum likelihood estimation. They establish identifiability conditions for the model and, under random design, provide the first high-probability existence guarantee for the estimator, along with minimax-optimal non-asymptotic error bounds for both the parametric and nonparametric components. Experiments on synthetic data and the ATP tennis dataset demonstrate the method's effectiveness, with empirical results closely aligning with the theoretical guarantees.
Abstract
Classical latent-score ranking models often fail to distinguish objects' intrinsic scores from contextual effects, which are typically nonlinear and can dominate the observed outcomes. To address this, we introduce a semiparametric ranking framework in which the log-score of each object is modeled as the sum of a utility parameter and a nonparametric covariate effect. Within this framework, we establish model identifiability under mild regularity and connectivity conditions. For estimation, we approximate the covariate effect using a neural network and estimate the parameters via maximum likelihood. Under random design assumptions, we prove that the resulting estimator exists with high probability and derive non-asymptotic error bounds that achieve minimax optimality for both the parametric and nonparametric components. Numerical experiments on both synthetic data and an ATP tennis dataset are conducted to support our findings.
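
To make the model structure concrete, the following is a minimal sketch of the likelihood described above, not the authors' implementation. It assumes a Bradley-Terry-style pairwise comparison model (the abstract does not specify the comparison model), a tiny one-hidden-layer tanh network as a stand-in for the nonparametric covariate effect, and hypothetical names throughout (`covariate_effect`, `log_score`, `pairwise_nll`). The network has no output bias because a constant shift cancels in pairwise score differences.

```python
import math

def covariate_effect(x, w1, b1, w2):
    """Hypothetical stand-in for the covariate effect f(x):
    a one-hidden-layer tanh network with no output bias."""
    hidden = [
        math.tanh(sum(w * xk for w, xk in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    return sum(v * h for v, h in zip(w2, hidden))

def log_score(i, x, utility, net):
    """Log-score = utility parameter of object i + covariate effect f(x)."""
    return utility[i] + covariate_effect(x, *net)

def win_probability(i, x_i, j, x_j, utility, net):
    """Assumed Bradley-Terry form: P(i beats j) = sigmoid(s_i - s_j)."""
    diff = log_score(i, x_i, utility, net) - log_score(j, x_j, utility, net)
    return 1.0 / (1.0 + math.exp(-diff))

def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def pairwise_nll(comparisons, utility, net):
    """Negative log-likelihood over (winner, x_winner, loser, x_loser) tuples.
    Each term is -log sigmoid(s_winner - s_loser) = softplus(s_loser - s_winner)."""
    total = 0.0
    for w, x_w, l, x_l in comparisons:
        diff = log_score(w, x_w, utility, net) - log_score(l, x_l, utility, net)
        total += softplus(-diff)
    return total

# Toy example: two objects, one scalar covariate per object per comparison.
utility = [0.5, -0.5]                          # centered utilities (assumed normalization)
net = ([[1.0], [-1.0]], [0.0, 0.0], [0.3, 0.1])  # (w1, b1, w2), illustrative values only
comparisons = [(0, [0.2], 1, [0.9]), (1, [0.4], 0, [0.1])]
nll = pairwise_nll(comparisons, utility, net)
```

In a maximum likelihood fit, `pairwise_nll` would be minimized jointly over `utility` and the network weights with a gradient-based optimizer. Centering the utilities to sum to zero is a common normalization assumed here for illustration; the identifiability conditions actually used are those established in the paper.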