🤖 AI Summary
In multi-task dwell time (DT) prediction, models systematically under-predict medium-length dwell times because they over-rely on the spurious correlation between click-through rate (CTR) and DT. To address this, we propose ORCA, a causal disentanglement framework that, for the first time in multi-task DT prediction, separates beneficial (positive) from harmful (negative) transfer via causal reasoning. ORCA breaks the spurious correlation through feature-level counterfactual intervention and preserves constructive task interactions through a task-interaction module with instance-wise inverse weighting. It is independent of the backbone architecture, which makes it model-agnostic and simple to deploy. Experiments show that ORCA improves DT prediction metrics by 10.6% on average without degrading CTR performance, while mitigating the collapse of predictions toward extremely short or long dwell times. This work establishes a novel paradigm for causal multi-task learning in recommender systems.
📝 Abstract
Dwell time (DT) is a critical post-click metric for evaluating user preference in recommender systems, complementing the traditional click-through rate (CTR). Although multi-task learning is widely adopted to jointly optimize DT and CTR, we observe that multi-task models systematically collapse their DT predictions to the shortest and longest bins, under-predicting moderate durations. We attribute this under-representation of the moderate-duration bins to over-reliance on the spurious CTR-DT correlation, and propose ORCA to address it with causal decoupling. Specifically, ORCA explicitly models and subtracts CTR's negative transfer while preserving its positive transfer. We further introduce (i) feature-level counterfactual intervention and (ii) a task-interaction module with instance inverse-weighting, which together weaken the CTR-mediated effect and restore direct DT semantics. ORCA is model-agnostic and easy to deploy. Experiments show an average 10.6% lift in DT metrics without harming CTR. Code is available at https://github.com/Chrissie-Law/ORCA-Mitigating-Over-Reliance-for-Multi-Task-Dwell-Time-Prediction-with-Causal-Decoupling.
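The bin-collapse symptom described above can be checked with a simple diagnostic. The following is a minimal sketch, not part of ORCA or its released code: it assumes dwell times have already been discretized into integer bin ids, and the helper names `bin_shares` and `bin_share_gap` as well as the toy data are hypothetical. It compares the share of predictions per bin with the share of ground-truth labels per bin; negative gaps in the middle bins alongside positive gaps at the extremes would match the pattern the abstract reports.

```python
from collections import Counter


def bin_shares(bin_ids, num_bins):
    """Return the fraction of samples falling in each bin 0..num_bins-1."""
    if not bin_ids:
        raise ValueError("bin_ids must not be empty")
    counts = Counter(bin_ids)
    total = len(bin_ids)
    return [counts.get(b, 0) / total for b in range(num_bins)]


def bin_share_gap(true_bins, pred_bins, num_bins):
    """Predicted share minus true share per bin.

    A negative value means the bin is under-predicted,
    a positive value means it is over-predicted.
    """
    true_share = bin_shares(true_bins, num_bins)
    pred_share = bin_shares(pred_bins, num_bins)
    return [p - t for p, t in zip(pred_share, true_share)]


# Toy, made-up data (assumption): 5 DT bins, 10 samples.
true_bins = [0, 1, 1, 2, 2, 2, 3, 3, 4, 4]
pred_bins = [0, 0, 0, 2, 0, 4, 4, 4, 4, 4]

gap = bin_share_gap(true_bins, pred_bins, num_bins=5)
for b, g in enumerate(gap):
    print(f"bin {b}: predicted share minus true share = {g:+.2f}")
```

In this toy example the shortest and longest bins are over-predicted and the three middle bins are under-predicted. On real data the same comparison could be run before and after applying a debiasing method to see whether the middle bins recover.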