🤖 AI Summary
This paper addresses the problem of optimizing user engagement in French news headline recommendation when the available contexts are noisy. We propose a preference-learning-based contextual bandit framework that needs no explicit exploration, demonstrating theoretically and empirically that implicit exploration suffices for accurate click-through rate (CTR) prediction and policy optimization in realistic noisy settings. Our method integrates online interactive data collection, multi-strategy A/B testing, and lightweight preference modeling. Notably, we conduct the first systematic evaluation of how machine translation quality affects cross-lingual news engagement prediction. Experimental results show a +12.3% improvement in CTR prediction accuracy, validating the effectiveness of our exploration-free design. The approach offers a scalable, low-overhead paradigm for resource-constrained industrial news recommendation systems, bridging the gap between theoretical bandit learning and practical deployment constraints.
📝 Abstract
This study explores strategies for optimizing news headline recommendations through preference-based learning. Using real-world data on user interactions with French-language online news posts, we train a headline recommender agent in a contextual bandit setting. This allows us to examine the impact of translation on engagement predictions, as well as the benefits of different interactive strategies for user engagement during data collection. Our results show that explicit exploration may not be required in the presence of noisy contexts, opening the door to simpler yet efficient strategies in practice.
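
To make the contextual bandit setting concrete, here is a minimal sketch of a purely greedy headline recommender run on synthetic data. It is not the authors' implementation. The per-headline logistic click model, the Gaussian context noise, and all names (`GreedyHeadlineBandit`, `simulate`, `noise`) are assumptions introduced only for illustration. The agent always shows the headline with the highest predicted click probability, so any variety in its choices comes from the noisy contexts rather than from an explicit exploration rule.

```python
import math
import random


def sigmoid(z):
    # Clamp the score so math.exp cannot overflow on extreme values.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


class GreedyHeadlineBandit:
    """Hypothetical per-headline logistic click model with a greedy policy."""

    def __init__(self, n_arms, dim, lr=0.1):
        self.weights = [[0.0] * dim for _ in range(n_arms)]
        self.lr = lr

    def predict(self, arm, context):
        # Predicted click probability for one headline in this context.
        return sigmoid(sum(w * x for w, x in zip(self.weights[arm], context)))

    def select(self, context):
        # No epsilon and no optimism bonus: pick the best predicted CTR.
        scores = [self.predict(a, context) for a in range(len(self.weights))]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, context, clicked):
        # One SGD step on the log loss, for the displayed headline only.
        error = clicked - self.predict(arm, context)
        self.weights[arm] = [
            w + self.lr * error * x for w, x in zip(self.weights[arm], context)
        ]


def simulate(rounds=2000, n_arms=3, dim=4, noise=0.5, seed=0):
    """Run the greedy agent on synthetic users (assumed setup, not real data)."""
    rng = random.Random(seed)
    true_w = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_arms)]
    agent = GreedyHeadlineBandit(n_arms, dim)
    pulls = [0] * n_arms
    clicks = 0
    for _ in range(rounds):
        clean = [rng.gauss(0, 1) for _ in range(dim)]
        # The agent only observes a noisy version of the true context.
        noisy = [x + rng.gauss(0, noise) for x in clean]
        arm = agent.select(noisy)
        p_click = sigmoid(sum(w * x for w, x in zip(true_w[arm], clean)))
        clicked = 1 if rng.random() < p_click else 0
        agent.update(arm, noisy, clicked)
        pulls[arm] += 1
        clicks += clicked
    return pulls, clicks / rounds


if __name__ == "__main__":
    pulls, ctr = simulate()
    print("pulls per headline:", pulls)
    print("observed CTR:", round(ctr, 3))
```

Comparing this greedy agent against a variant with explicit exploration (for example, an epsilon-greedy `select`) at several `noise` levels would be one way to probe the claim informally; the synthetic setup says nothing about the paper's actual results.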