Forecasting the Buzz: Enriching Hashtag Popularity Prediction with LLM Reasoning

📅 2025-10-09
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Predicting hashtag popularity is critical for public opinion analysis and targeted advertising, yet existing approaches face key limitations: traditional regression models neglect semantic context, while large language models (LLMs) lack precision in numerical forecasting. This paper proposes an interpretable hybrid framework in which an instruction-tuned LLM serves as a contextual reasoner, generating human-readable rationales regarding topic relevance, audience reach, and temporal salience. These rationales are then structured into enhanced features for a classical regression model, which performs the final popularity prediction. We introduce and publicly release HashView, the first large-scale benchmark dataset for hashtag popularity prediction, comprising 7,532 hashtags. Experiments demonstrate that our method achieves up to a 2.8% reduction in RMSE and a 30% improvement in Pearson correlation coefficient over state-of-the-art baselines, effectively balancing predictive accuracy with human-interpretable decision rationale.

📝 Abstract
Hashtag trends ignite campaigns, shift public opinion, and steer millions of dollars in advertising spend, yet forecasting which tag goes viral is elusive. Classical regressors digest surface features but ignore context, while large language models (LLMs) excel at contextual reasoning but misestimate numbers. We present BuzzProphet, a reasoning-augmented hashtag popularity prediction framework that (1) instructs an LLM to articulate a hashtag's topical virality, audience reach, and timing advantage; (2) utilizes these popularity-oriented rationales to enrich the input features; and (3) regresses on these inputs. To facilitate evaluation, we release HashView, a 7,532-hashtag benchmark curated from social media. Across diverse regressor-LLM combinations, BuzzProphet reduces RMSE by up to 2.8% and boosts correlation by 30% over baselines, while producing human-readable rationales. Results demonstrate that using LLMs as context reasoners rather than numeric predictors injects domain insight into tabular models, yielding an interpretable and deployable solution for social media trend forecasting.
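The two reported metrics, RMSE and correlation, can be sketched as follows. This is a minimal illustration, assuming that the quoted percentages are relative changes against a baseline (the abstract does not say so explicitly) and that "correlation" means the Pearson coefficient named in the summary. The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def rmse(y_true: list[float], y_pred: list[float]) -> float:
    """Root mean squared error between targets and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def relative_change(baseline: float, new: float) -> float:
    """Signed relative change of `new` with respect to `baseline`."""
    return (new - baseline) / baseline


# Illustrative numbers only: a drop from 1.000 to 0.972 is a 2.8% reduction,
# and a rise from 0.50 to 0.65 is a 30% improvement.
rmse_change = relative_change(1.000, 0.972)
corr_change = relative_change(0.50, 0.65)
```

Under this reading, a lower RMSE yields a negative relative change and a higher correlation yields a positive one, so the two headline figures point in opposite directions by construction.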
Problem

Research questions and friction points this paper is trying to address.

Predicting hashtag popularity using contextual reasoning
Combining LLM insights with classical regression models
Improving forecast accuracy for social media trends
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM generates popularity rationales for hashtags
Rationales enrich input features for regressors
Combines LLM reasoning with numerical regression
Yifei Xu
National University of Singapore, Singapore
Jiaying Wu
National University of Singapore
Natural Language Processing, Data Mining, Mis/Disinformation, Social Computing
Herun Wan
Xi'an Jiaotong University
network analysis, large language model
Yang Li
Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing, China
Zhen Hou
Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing, China
Min-Yen Kan
National University of Singapore, Singapore