🤖 AI Summary
This work addresses product-side fairness in bundle recommendation (BR), where exposure can be unevenly distributed at both the bundle level and the individual item level, and examines how user behavioral patterns influence fairness. It is a reproducibility study that evaluates existing fairness metrics at both granularities using four state-of-the-art BR models on three real-world datasets. Results reveal that (i) fairness assessments vary considerably depending on the metric used; (ii) exposure patterns differ notably between bundles and items, so fairness interventions need to go beyond bundle-level assumptions; and (iii) when users interact more frequently with bundles than with individual items, BR systems tend to yield fairer exposure distributions at both levels. The study argues for fairness interventions that account for both granularities and for multi-faceted evaluation with several metrics. The findings offer actionable insights for designing fairer BR systems.
📝 Abstract
Recommender systems are known to exhibit fairness issues, particularly on the product side, where products and their associated suppliers receive unequal exposure in recommended results. While this problem has been widely studied in traditional recommendation settings, its implications for bundle recommendation (BR) remain largely unexplored. This emerging task introduces additional complexity: recommendations are generated at the bundle level, yet user satisfaction and product (or supplier) exposure depend on both the bundle and the individual items it contains. Existing fairness frameworks and metrics designed for traditional recommender systems may not directly translate to this multi-layered setting. In this paper, we conduct a comprehensive reproducibility study of product-side fairness in BR across three real-world datasets using four state-of-the-art BR methods. We analyze exposure disparities at both the bundle and item levels using multiple fairness metrics, uncovering important patterns. Our results show that exposure patterns differ notably between bundles and items, revealing the need for fairness interventions that go beyond bundle-level assumptions. We also find that fairness assessments vary considerably depending on the metric used, reinforcing the need for multi-faceted evaluation. Furthermore, user behavior plays a critical role: when users interact more frequently with bundles than with individual items, BR systems tend to yield fairer exposure distributions across both levels. Overall, our findings offer actionable insights for building fairer bundle recommender systems and establish a vital foundation for future research in this emerging domain.
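To make the two levels of exposure concrete, the following is a minimal sketch, not the paper's evaluation code. It assumes exposure is a simple count of appearances in recommendation lists and uses the Gini coefficient as one possible inequality measure; the abstract does not name its metrics, so that choice, the toy data, and the helper names `gini` and `exposure_counts` are all illustrative assumptions.

```python
from collections import Counter


def gini(values):
    """Gini coefficient of non-negative values (0.0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Sorted-values formula: sum_i (2i - n - 1) * x_i / (n * total), i = 1..n
    weighted = sum((2 * (i + 1) - n - 1) * x for i, x in enumerate(xs))
    return weighted / (n * total)


def exposure_counts(rec_lists, bundle_items):
    """Count exposure per bundle and per item (assumption: one unit per appearance)."""
    # Start every bundle and item at zero so unexposed ones still count.
    bundle_exp = Counter({b: 0 for b in bundle_items})
    item_exp = Counter({i: 0 for items in bundle_items.values() for i in items})
    for recs in rec_lists:
        for b in recs:
            bundle_exp[b] += 1
            # An item is exposed whenever a bundle containing it is recommended.
            for i in bundle_items[b]:
                item_exp[i] += 1
    return bundle_exp, item_exp


# Hypothetical toy catalogue: item "i1" belongs to two bundles.
bundle_items = {"b1": ["i1", "i2"], "b2": ["i1", "i3"], "b3": ["i4"]}
# Hypothetical top-2 recommendation lists for three users.
rec_lists = [["b1", "b2"], ["b2", "b3"], ["b1", "b3"]]

bundle_exp, item_exp = exposure_counts(rec_lists, bundle_items)
bundle_gini = gini(bundle_exp.values())
item_gini = gini(item_exp.values())
print(f"bundle-level Gini: {bundle_gini:.2f}, item-level Gini: {item_gini:.2f}")
```

In this toy setup every bundle is recommended twice, so the bundle-level Gini is 0, while item `i1` is exposed four times against two for each other item, giving an item-level Gini of 0.15. This is the kind of gap between levels that a bundle-only fairness assessment would not show.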