🤖 AI Summary
This work addresses a limitation of current evaluations of large language models (LLMs) for user profiling: they typically rely on static data snapshots and thus fail to capture how user interests evolve in real-world scenarios. Framing streaming user profiling as a continuous state maintenance task, the authors construct a dynamic dataset of over 120,000 user-generated content items from more than 7,000 users across five platforms and introduce a fine-grained user profiling benchmark tailored to realistic streaming settings. Using an evaluation framework that leverages the temporal correlation of user interests and requires no manual annotations, they show that 14 mainstream LLMs exhibit a systemic conservative bias: the models over-retain outdated interests and struggle to detect interest decay. Ablation experiments support the necessity and practical utility of the streaming profiling paradigm.
📝 Abstract
Large Language Models (LLMs) have reshaped user profiling, yet current evaluations mainly focus on static data snapshots. This paradigm overlooks the reality of personalized systems, where User-Generated Content (UGC) arrives continuously and fine-grained profiles evolve rapidly. To bridge this gap, we introduce StreamProfileBench, a large-scale benchmark for fine-grained streaming user profiling. We formalize streaming user profiling as a continuous state maintenance task and curate a highly authentic dataset comprising over 120,000 UGC posts from 7,000+ real users across five diverse platforms. By leveraging the temporal correlation of user interests, we further propose a novel, annotation-free evaluation framework. Extensive experiments across 14 leading LLMs reveal that continuous profile updating remains an open challenge. Models exhibit a systemic conservative bias, over-retaining past interests while failing to recognize interest decay. Ablation experiments further validate the practical utility and necessity of the streaming paradigm.
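The abstract does not spell out the task interface or the scoring rule, so the following is only a toy sketch of what "continuous state maintenance" and an annotation-free check could look like. Everything in it is an assumption made for illustration: the `StreamingProfile` class, the `decay` and `drop_below` parameters, the rule-based update (in the benchmark an LLM maintains the profile), and the Jaccard-based `future_overlap` score, which stands in for the idea that a good profile at one step should agree with the content the user posts next.

```python
from dataclasses import dataclass, field


@dataclass
class StreamingProfile:
    """Hypothetical profile state: interest tag -> weight in [0, 1]."""

    decay: float = 0.5        # assumed per-step decay factor
    drop_below: float = 0.2   # assumed threshold below which an interest is forgotten
    weights: dict[str, float] = field(default_factory=dict)

    def update(self, post_tags: set[str]) -> None:
        """Fold one incoming batch of post tags into the profile state."""
        # Decay every existing interest first.
        self.weights = {t: w * self.decay for t, w in self.weights.items()}
        # Reinforce (or add) the interests observed in the new batch.
        for tag in post_tags:
            self.weights[tag] = 1.0
        # Forget interests whose weight has fallen below the threshold.
        self.weights = {
            t: w for t, w in self.weights.items() if w >= self.drop_below
        }

    def interests(self) -> set[str]:
        """Return the interests currently held in the profile."""
        return set(self.weights)


def future_overlap(profile: set[str], next_tags: set[str]) -> float:
    """Jaccard overlap between a profile and the tags of the next batch.

    An illustrative annotation-free proxy, not the paper's metric.
    """
    union = profile | next_tags
    return len(profile & next_tags) / len(union) if union else 1.0
```

In this toy setup, a profile that never decays would keep every interest it has ever seen, and its overlap with later batches would fall as the user moves on. That is the kind of over-retention the abstract describes as a conservative bias.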