🤖 AI Summary
This work addresses the challenge that user click data, often corrupted by position bias, selection bias, and trust bias, fails to accurately reflect true relevance. To mitigate these intertwined biases, the authors propose a ranking framework grounded in causal inference that integrates structural causal models with information-theoretic principles. The framework quantifies bias leakage via conditional mutual information and introduces a decoupling regularizer to jointly model multiple biases. Coupled with a doubly robust estimator, the method reduces bias interference and improves ranking performance on standard Learning-to-Rank benchmarks, with particularly strong gains in scenarios where multiple biases interact.
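As a sketch of the objective this summary describes (the notation below is ours, not taken from the paper): writing $\hat{r}_\theta$ for the learned relevance score, $b$ for the bias variables (e.g., position and trust), and $r$ for true relevance, a leakage-regularized risk of this kind might take the form

```latex
% Illustrative objective (our notation, not the paper's): doubly robust
% risk plus a conditional-mutual-information penalty on bias leakage.
\mathcal{L}(\theta)
  \;=\; \widehat{\mathcal{R}}_{\mathrm{DR}}(\theta)
  \;+\; \lambda \, I\big(\hat{r}_\theta ;\, b \mid r\big),
\qquad
I(\hat{r}_\theta ; b \mid r) = 0
  \;\iff\; \hat{r}_\theta \perp\!\!\!\perp b \mid r .
```

Driving the CMI term to zero is precisely the disentanglement criterion the summary refers to: given true relevance, the learned scores carry no residual information about the bias variables.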
📝 Abstract
In web search and recommendation systems, user clicks are widely used to train ranking models. However, click data is heavily biased: users tend to click higher-ranked items (position bias), choose only from what was shown to them (selection bias), and trust top results more (trust bias). Without explicitly modeling these biases, the true relevance of ranked items cannot be correctly learned from clicks. Existing Unbiased Learning-to-Rank (ULTR) methods mainly correct position bias and rely on propensity estimation, but they cannot measure the remaining bias, provide risk guarantees, or jointly handle multiple bias sources. To overcome these challenges, this paper introduces a causal learning-based ranking framework that extends ULTR by combining Structural Causal Models (SCMs) with information-theoretic tools. SCMs specify how clicks are generated and help identify the true relevance signal in click data, while conditional mutual information measures how much bias leaks into the learned relevance estimates. We use this leakage measure to define a rigorous notion of disentanglement and include it as a regularizer during model training to reduce bias. In addition, we incorporate a doubly robust estimator from causal inference to obtain more reliable risk estimates. Experiments on standard Learning-to-Rank benchmarks show that our method consistently reduces measured bias leakage and improves ranking performance, especially in realistic scenarios where multiple biases, such as position and trust bias, interact strongly.
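The training recipe the abstract describes (a doubly robust risk plus a CMI-style disentanglement regularizer) can be sketched in a few lines. The PyTorch code below is a minimal illustration on synthetic position-biased clicks, not the authors' implementation: the known 1/position propensities, the observed examination indicator, the model shapes, and in particular the leakage penalty, a batchwise partial correlation between scores and position given imputed relevance, are all our simplifying stand-ins for the paper's conditional mutual information estimator.

```python
# Minimal sketch of leakage-regularized, doubly robust ULTR training.
# Assumptions (ours, not the paper's): known 1/position propensities, an
# observed examination indicator, and a partial-correlation proxy in place
# of a real conditional-mutual-information estimator.
import torch
import torch.nn as nn

torch.manual_seed(0)

# --- Synthetic position-biased click log ---------------------------------
n, d = 2048, 16
x = torch.randn(n, d)                         # query-document features
pos = torch.randint(1, 11, (n,)).float()      # display positions 1..10
true_rel = torch.sigmoid(2.0 * x[:, 0])       # hidden true relevance
prop = 1.0 / pos                              # examination propensity ~ 1/rank
obs = (torch.rand(n) < prop).float()          # examination indicator
click = obs * (torch.rand(n) < true_rel).float()  # click = examined AND relevant

rel_model = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))
imp_model = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(
    list(rel_model.parameters()) + list(imp_model.parameters()), lr=1e-2
)
bce = nn.BCEWithLogitsLoss(reduction="none")
lam = 0.5  # weight on the leakage regularizer


def residualize(v, z):
    """Residual of v after linear regression on z (with intercept)."""
    Z = torch.stack([z, torch.ones_like(z)], dim=1)
    beta = torch.linalg.solve(Z.T @ Z, Z.T @ v)
    return v - Z @ beta


for step in range(300):
    s = rel_model(x).squeeze(-1)              # relevance logits
    imp_logits = imp_model(x).squeeze(-1)
    c_hat = torch.sigmoid(imp_logits)         # imputed relevance

    # Imputation model: IPS-weighted fit on examined items only.
    imp_fit = (obs * bce(imp_logits, click) / prop).mean()

    # Doubly robust risk: imputed error everywhere, plus a
    # propensity-weighted correction where the item was examined.
    err_imp = bce(s, c_hat.detach())
    err_obs = bce(s, click)
    dr_risk = (err_imp + obs * (err_obs - err_imp) / prop).mean()

    # Leakage proxy: squared partial correlation between scores and
    # position given imputed relevance (a stand-in for I(s; pos | r)).
    rs = residualize(s, c_hat.detach())
    rp = residualize(pos, c_hat.detach())
    leak = ((rs * rp).mean() / (rs.std() * rp.std() + 1e-8)) ** 2

    loss = dr_risk + imp_fit + lam * leak
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final leakage proxy: {leak.item():.4f}")
```

In practice the propensities would themselves be estimated, and a proper CMI estimator (e.g., variational or kernel-based) would replace the partial-correlation proxy; the structure of the loss, a doubly robust risk term plus a weighted leakage penalty, is the point of the sketch.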