A hierarchy tree data structure for behavior-based user segment representation

📅 2025-08-01
🤖 AI Summary
Problem: Cold-start and modeling challenges for new or infrequent users persist in recommender systems. Method: This paper proposes Behavior-based User Segmentation (BUS), a tree-based data structure that hierarchically segments users by their categorical attributes according to product-specific engagement behavior. Tree construction uses Normalized Discounted Cumulative Gain (NDCG) as the objective function, so that each segment captures how well active users' behavior represents that of marginal users in the same segment. Leaf and internal nodes are then aggregated to produce popular social content and behavioral patterns for every node. The social graph is used to derive connection-based segments, which are combined with the user's own segment to mitigate bias and improve fairness. Contribution/Results: Offline analysis shows that BUS-based retrieval significantly outperforms traditional cohort-based aggregation on ranking quality. Online tests on production traffic show statistically significant improvements in product metrics, including music ranking and email notifications. The authors describe the work as, to their knowledge, the first list-wise learning-to-rank framework for tree-based recommendation that integrates diverse categorical user attributes while preserving semantic interpretability at industrial scale.

📝 Abstract
User attributes are essential in multiple stages of modern recommendation systems and are particularly important for mitigating the cold-start problem and improving the experience of new or infrequent users. We propose Behavior-based User Segmentation (BUS), a novel tree-based data structure that hierarchically segments the user universe by users' categorical attributes, based on the users' product-specific engagement behaviors. During BUS tree construction, we use Normalized Discounted Cumulative Gain (NDCG) as the objective function to maximize the behavioral representativeness of marginal users relative to active users in the same segment. The constructed BUS tree undergoes further processing and aggregation across the leaf nodes and internal nodes, allowing the generation of popular social content and behavioral patterns for each node in the tree. To further mitigate bias and improve fairness, we use the social graph to derive the user's connection-based BUS segments, which lets us combine behavioral patterns extracted from both the user's own segment and the connection-based segments into a connection-aware BUS-based recommendation. Our offline analysis shows that BUS-based retrieval significantly outperforms traditional user cohort-based aggregation on ranking quality. We have successfully deployed our data structure and machine learning algorithm and tested them on production traffic serving billions of users daily, achieving statistically significant improvements in online product metrics, including music ranking and email notifications. To the best of our knowledge, our study represents the first list-wise learning-to-rank framework for tree-based recommendation that effectively integrates diverse user categorical attributes while preserving real-world semantic interpretability at a large industrial scale.
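To make the NDCG objective concrete, the following is a minimal sketch of how the representativeness of a single segment might be scored. It assumes that active users' engagement counts define the segment's ranked list and that marginal users' engagement counts serve as the relevance labels. The function names (`dcg`, `ndcg_at_k`, `segment_representativeness`), the cutoff `k`, and the input layout are illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def dcg(gains):
    """Discounted cumulative gain of gains listed in ranked order."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(ranked_items, relevance, k):
    """NDCG@k of ranked_items against an item -> relevance mapping."""
    gains = [relevance.get(item, 0.0) for item in ranked_items[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def segment_representativeness(active_events, marginal_events, k=10):
    """Score how well active users' popularity ranking matches marginal users.

    Assumption: each argument is a flat list of item ids, one entry per
    engagement event from users in the same segment.
    """
    # Rank items by how often active users engaged with them.
    ranked = [item for item, _ in Counter(active_events).most_common()]
    # Marginal users' engagement counts act as graded relevance labels.
    relevance = Counter(marginal_events)
    return ndcg_at_k(ranked, relevance, k)
```

A score close to 1 means the content popular among active users is ordered much like the content that marginal users actually engaged with. A segment with no marginal-user events scores 0 in this sketch, which is one of several possible conventions.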
Problem

Research questions and friction points this paper is trying to address.

Proposes hierarchical user segmentation to improve recommendation systems
Mitigates cold-start issues by leveraging user behavior and attributes
Enhances fairness and ranking quality via connection-aware BUS segments
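The connection-aware idea can be illustrated with a small sketch that blends item scores from the user's own segment with scores from the segments of the user's connections. The abstract does not say how the two sources are combined, so the weighted sum, the `alpha` parameter, and the function name `blend_segment_scores` below are assumptions made purely for illustration.

```python
from collections import defaultdict


def blend_segment_scores(own_scores, connection_scores, alpha=0.5, top_k=10):
    """Blend item scores from the user's segment and connection segments.

    own_scores: dict mapping item -> score for the user's own segment.
    connection_scores: list of dicts, one per connection-based segment.
    alpha: hypothetical weight on the user's own segment (0..1).
    Returns the top_k item ids by blended score, ties broken by item id.
    """
    blended = defaultdict(float)
    for item, score in own_scores.items():
        blended[item] += alpha * score
    if connection_scores:
        # Spread the remaining weight evenly over connection segments.
        weight = (1.0 - alpha) / len(connection_scores)
        for segment in connection_scores:
            for item, score in segment.items():
                blended[item] += weight * score
    return sorted(blended, key=lambda item: (-blended[item], item))[:top_k]
```

With `alpha=1.0` the result reduces to the user's own segment ranking, and lowering `alpha` shifts weight toward what the user's connections' segments engage with.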
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical tree segments users by behavior
NDCG optimizes behavioral representativeness in segments
Social graph enhances fairness in BUS recommendations
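One way to picture the tree construction is a greedy step that picks, for a given node, the categorical attribute whose split yields the highest size-weighted NDCG across the resulting child segments. The sketch below follows that reading. The splitting rule, the user record layout (`active`, `items`, and attribute keys), and the names `segment_score` and `best_split` are hypothetical and are not confirmed by the source.

```python
import math
from collections import Counter, defaultdict


def ndcg(ranked, relevance, k=10):
    """NDCG@k of a ranked item list against an item -> relevance mapping."""
    dcg = sum(relevance.get(it, 0) / math.log2(i + 2)
              for i, it in enumerate(ranked[:k]))
    ideal_gains = sorted(relevance.values(), reverse=True)[:k]
    ideal = sum(g / math.log2(i + 2) for i, g in enumerate(ideal_gains))
    return dcg / ideal if ideal > 0 else 0.0


def segment_score(users, k=10):
    """NDCG of active users' popularity ranking against marginal users."""
    active = Counter(it for u in users if u["active"] for it in u["items"])
    marginal = Counter(it for u in users if not u["active"] for it in u["items"])
    ranked = [it for it, _ in active.most_common()]
    return ndcg(ranked, marginal, k)


def best_split(users, attributes, k=10):
    """Return (attribute, score) for the split that most improves NDCG.

    Returns (None, parent_score) when no attribute beats the unsplit node.
    """
    best_attr, best_score = None, segment_score(users, k)
    for attr in attributes:
        children = defaultdict(list)
        for u in users:
            children[u[attr]].append(u)
        if len(children) < 2:
            continue  # attribute takes a single value here; nothing to split
        # Weight each child's NDCG by its share of the node's users.
        score = sum(len(c) * segment_score(c, k)
                    for c in children.values()) / len(users)
        if score > best_score:
            best_attr, best_score = attr, score
    return best_attr, best_score
```

Applying `best_split` recursively to each child, with a stopping rule such as a minimum segment size, would produce a tree whose internal nodes correspond to readable attribute values, which is consistent with the interpretability the abstract emphasizes.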
Authors
Yang Liu
Meta, Menlo Park, California, USA 94025
Xuejiao Kang
Meta, Menlo Park, California, USA 94025
Sathya Iyer
Meta, Menlo Park, California, USA 94025
Idris Malik
Meta, Menlo Park, California, USA 94025
Ruixuan Li
Meta, Menlo Park, California, USA 94025
Juan Wang
Meta, Menlo Park, California, USA 94025
Xinchen Lu
Meta, Menlo Park, California, USA 94025
Xiangxue Zhao
University of Maryland, College Park; Facebook
Data-Driven Design and Control, Real-time prediction, Reinforcement learning
Dayong Wang
Meta, Menlo Park, California, USA 94025
Menghan Liu
Meta, Menlo Park, California, USA 94025
Isaac Liu
Meta, Menlo Park, California, USA 94025
Feng Liang
Department of Statistics, University of Illinois at Urbana-Champaign
Information Theory, Bayesian Statistics, Machine Learning
Yinzhe Yu
Meta, Menlo Park, California, USA 94025