🤖 AI Summary
To address ambiguity in how similar users are defined, the difficulty of serving users with unique preferences who lack appropriate neighbors, and the performance degradation caused by relying on misidentified neighbors in clustering-based bandit recommender systems, this paper proposes CoCoB, an adaptive Collaborative Combinatorial Bandits framework. The method jointly applies bandit principles on both the user and item sides, discovers neighbors dynamically via a similarity probability threshold, and automatically degrades to a single-user bandit when no suitable neighbors exist. It combines an enhanced Bayesian model of user similarity with a regret analysis in the linear contextual bandit setting. Experiments on three real-world datasets show an average 2.4% improvement in F1 score over state-of-the-art baselines, supporting both the theoretical analysis and the method's practical effectiveness.
📝 Abstract
Clustering bandits have gained significant attention in recommender systems by leveraging collaborative information from neighboring users to better capture target user preferences. However, these methods often lack a clear definition of similar users and face challenges when users with unique preferences lack appropriate neighbors. In such cases, relying on divergent preferences of misidentified neighbors can degrade recommendation quality. To address these limitations, this paper proposes an adaptive Collaborative Combinatorial Bandits algorithm (CoCoB). CoCoB employs an innovative two-sided bandit architecture, applying bandit principles to both the user and item sides. The user-bandit employs an enhanced Bayesian model to explore user similarity, identifying neighbors based on a similarity probability threshold. The item-bandit treats items as arms, generating diverse recommendations informed by the user-bandit's output. CoCoB dynamically adapts, leveraging neighbor preferences when available or focusing solely on the target user otherwise. Regret analysis under a linear contextual bandit setting and experiments on three real-world datasets demonstrate CoCoB's effectiveness, achieving an average 2.4% improvement in F1 score over state-of-the-art methods.
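The two-sided architecture described above can be sketched roughly as follows. This is an illustrative toy, not the paper's algorithm: the class names are invented, the user-bandit is simplified to a Beta-Bernoulli (Thompson-sampling) model of per-candidate similarity, and the item-bandit is a plain UCB over item arms rather than the linear contextual bandit the paper analyzes. It only illustrates the control flow: sample similarities, keep candidates above the threshold as neighbors, and fall back to the target user alone when none qualify.

```python
import math
import random


class UserBandit:
    """Toy user-side bandit (assumption, not the paper's model).

    Keeps Beta(alpha, beta) agreement counts per candidate neighbor;
    a Thompson-sampled similarity above `threshold` marks a neighbor."""

    def __init__(self, candidate_ids, threshold=0.5):
        self.threshold = threshold
        self.alpha = {u: 1.0 for u in candidate_ids}  # agreements + 1
        self.beta = {u: 1.0 for u in candidate_ids}   # disagreements + 1

    def select_neighbors(self):
        # May return [] -- caller then degrades to a single-user bandit.
        return [u for u in self.alpha
                if random.betavariate(self.alpha[u], self.beta[u])
                >= self.threshold]

    def update(self, user_id, agreed):
        # Did this candidate's feedback agree with the target user's?
        if agreed:
            self.alpha[user_id] += 1.0
        else:
            self.beta[user_id] += 1.0


class ItemBandit:
    """Toy item-side bandit: UCB over items as arms, able to pool
    (optionally down-weighted) feedback from selected neighbors."""

    def __init__(self, item_ids):
        self.counts = {i: 0 for i in item_ids}
        self.rewards = {i: 0.0 for i in item_ids}
        self.t = 0  # number of recommendation rounds so far

    def recommend(self, k=1):
        self.t += 1

        def ucb(i):
            if self.counts[i] == 0:
                return float("inf")  # try every item at least once
            mean = self.rewards[i] / self.counts[i]
            return mean + math.sqrt(2.0 * math.log(self.t) / self.counts[i])

        return sorted(self.counts, key=ucb, reverse=True)[:k]

    def update(self, item_id, reward, weight=1.0):
        # weight < 1.0 could discount neighbor feedback vs. the target's.
        self.counts[item_id] += 1
        self.rewards[item_id] += weight * reward
```

A round would then look like: `neighbors = user_bandit.select_neighbors()`, recommend items with `item_bandit.recommend(k)`, update the item-bandit with the target's reward (and neighbors' rewards at reduced weight if any were selected), and update the user-bandit with whether each candidate agreed with the target.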