ALPBench: A Benchmark for Attribution-level Long-term Personal Behavior Understanding

📅 2026-02-03
🤖 AI Summary
Existing recommendation systems struggle to model users’ long-term, fine-grained preferences and lack the ability to understand new items at the attribute level. To address this limitation, this work proposes replacing traditional item-level recommendation with attribute-combination prediction as the evaluation objective and introduces ALPBench—the first benchmark specifically designed for attribute-level modeling of long-term user behavior. ALPBench reformulates historical user interactions into natural language sequences, leveraging the reasoning and generalization capabilities of large language models (LLMs) to enable interpretable and verifiable modeling of multi-attribute interactions and persistent interests. Experimental results reveal significant limitations of current LLMs in predicting complex attribute combinations, thereby establishing a novel evaluation paradigm for personalized recommendation systems.

📝 Abstract
Recent advances in large language models have highlighted their potential for personalized recommendation, where accurately capturing user preferences remains a key challenge. Leveraging their strong reasoning and generalization capabilities, LLMs offer new opportunities for modeling long-term user behavior. To systematically evaluate this, we introduce ALPBench, a Benchmark for Attribution-level Long-term Personal Behavior Understanding. Unlike item-focused benchmarks, ALPBench predicts user-interested attribute combinations, enabling ground-truth evaluation even for newly introduced items. It models preferences from long-term historical behaviors rather than users' explicitly expressed requests, better reflecting enduring interests. User histories are represented as natural language sequences, allowing interpretable, reasoning-based personalization. ALPBench enables fine-grained evaluation of personalization by focusing on the prediction of attribute combinations, a task that remains highly challenging for current LLMs because it requires capturing complex interactions among multiple attributes and reasoning over long-term user behavior sequences.
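
The abstract describes two mechanics: serializing a user's history into a natural language sequence, and scoring a predicted attribute combination against a ground-truth combination. The following is a minimal sketch of both, not ALPBench's actual format or metrics, which this page does not specify. The `Interaction` fields, the sentence template, and the two scoring functions (exact match and Jaccard overlap) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    """One historical behavior. Field names are hypothetical, not from the paper."""
    day: str                    # ISO date string, e.g. "2025-01-02"
    action: str                 # e.g. "purchased", "viewed"
    attributes: dict[str, str]  # attribute name -> attribute value


def history_to_text(history: list[Interaction]) -> str:
    """Serialize interactions, oldest first, into one sentence per event."""
    lines = []
    for event in sorted(history, key=lambda e: e.day):
        attrs = ", ".join(f"{k}={v}" for k, v in sorted(event.attributes.items()))
        lines.append(f"On {event.day}, the user {event.action} an item with {attrs}.")
    return "\n".join(lines)


def exact_match(predicted: dict[str, str], target: dict[str, str]) -> bool:
    """True only if every attribute in the combination is predicted correctly."""
    return predicted == target


def attribute_overlap(predicted: dict[str, str], target: dict[str, str]) -> float:
    """Jaccard overlap of (attribute, value) pairs; an assumed partial-credit score."""
    p, t = set(predicted.items()), set(target.items())
    if not p and not t:
        return 1.0
    return len(p & t) / len(p | t)


if __name__ == "__main__":
    history = [
        Interaction("2025-03-10", "viewed", {"category": "shoes", "color": "blue"}),
        Interaction("2025-01-02", "purchased", {"color": "red", "category": "shoes"}),
    ]
    print(history_to_text(history))

    predicted = {"category": "shoes", "color": "red"}
    target = {"category": "shoes", "color": "blue"}
    print(exact_match(predicted, target), attribute_overlap(predicted, target))
```

Because the target is a combination of attribute values rather than an item ID, a prediction can be checked against the attributes of an item that never appeared in the history. Exact match is strict about multi-attribute interactions, while the overlap score shows how close a partially correct prediction is.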
Problem

Research questions and friction points this paper is trying to address.

personalized recommendation
long-term behavior
attribute-level preference
user modeling
LLM evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Attribution-level Personalization
Long-term Behavior Modeling
Natural Language User History
Attribute Combination Prediction
LLM-based Recommendation
Lu Ren
KuaiShou Inc., Beijing, China
Junda She
KuaiShou Inc., Beijing, China
Xinchen Luo
KuaiShou
Tao Wang
ByteDance
Machine Translation, Natural Language Processing
Xin Ye
KuaiShou Inc., Beijing, China
Xu Zhang
KuaiShou Inc., Beijing, China
Muxuan Wang
KuaiShou Inc., Beijing, China
Xiao Yang
KuaiShou Inc., Beijing, China
Chenguang Wang
KuaiShou Inc., Beijing, China
Fei Xie
KuaiShou Inc., Beijing, China
Yiwei Zhou
KuaiShou Inc., Beijing, China
Danjun Wu
KuaiShou Inc., Beijing, China
Guodong Zhang
xAI
Machine Learning
Yifei Hu
KuaiShou Inc., Beijing, China
Guoying Zheng
KuaiShou Inc., Beijing, China
Shu-Jun Yang
KuaiShou Inc., Beijing, China
Xing-Yao Wang
KuaiShou Inc., Beijing, China
Shiyao Wang
KuaiShou Inc., Beijing, China
Yukun Zhou
KuaiShou Inc., Beijing, China
Fangkai Yang
KuaiShou Inc., Beijing, China
Size Li
KuaiShou Inc., Beijing, China
Kuo Cai
KuaiShou Inc., Beijing, China
Qiang Luo
Principal Investigator, Institute of Science and Technology for Brain-Inspired Intelligence (ISTBI), Fudan University
Computational Psychiatry, NeuroImage, Complex Causal Models
Ruiming Tang
KuaiShou Inc., Beijing, China
Hangxiu Li
KuaiShou Inc., Beijing, China
Kun Gai
Senior Director & Researcher, Alibaba Group
Machine Learning, Computational Advertising