🤖 AI Summary
Existing information diffusion prediction models lack a unified evaluation framework across tasks of varying difficulty.
Method: We propose the “performance characteristic curve” paradigm: information entropy quantifies the randomness of diffusion sequences, and a scaling law links that entropy to prediction accuracy, normalizing performance across sequence lengths, network sizes, and randomness levels. The approach combines multivariate scaling analysis, diffusion dynamics modeling, and validation on three baseline models from the same family.
Contribution/Results: The curve defines model capability as a systematic metric that is robust to uncertainty, moving beyond single-score evaluations. Validated on eight state-of-the-art models, the method sharpens the distinction between models that are hard to tell apart with conventional metrics, unifies the performance characterization of the three same-family baselines, and demonstrates broad applicability and practical utility in real-world diffusion prediction assessment.
📝 Abstract
Information diffusion prediction on social networks aims to forecast the future recipients of a message, with practical applications in marketing and social media. While different prediction models all claim to perform well, general frameworks for performance evaluation remain limited. Here, we aim to identify a performance characteristic curve for a model, which captures its performance on tasks of different complexity. We propose a metric based on information entropy to quantify the randomness in diffusion data. We then identify a scaling pattern between this randomness and the prediction accuracy of the model. By properly rescaling the variables, data points from different sequence lengths, system sizes, and randomness levels all collapse onto a single curve. The curve captures a model's inherent capability of making correct predictions under increasing uncertainty, which we regard as the performance characteristic curve of the model. The validity of the curve is tested on three prediction models from the same family, reaching conclusions in line with existing studies. In addition, we apply the curve to assess the performance of eight state-of-the-art models, providing a clear and comprehensive evaluation even for models that are difficult to differentiate with conventional metrics. Our work reveals a pattern underlying data randomness and prediction accuracy. The performance characteristic curve provides a new way to evaluate model performance systematically, and sheds light on future studies of other frameworks for model evaluation.
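To make the entropy-based randomness metric concrete, here is a minimal illustrative sketch. It assumes randomness is measured as the Shannon entropy of the empirical distribution of recipients in a diffusion sequence; the paper's exact definition (and any conditioning on sequence length or network size) may differ, so treat this only as an intuition aid.

```python
import math
from collections import Counter

def shannon_entropy(sequence):
    """Shannon entropy (in bits) of the empirical symbol distribution.

    Illustrative proxy for the paper's randomness metric: a cascade
    dominated by a few recipients scores low, a cascade spread evenly
    over many distinct recipients scores high.
    """
    counts = Counter(sequence)
    n = len(sequence)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A cascade concentrated on one user is less random than one spread
# over four distinct users.
low = shannon_entropy(["a", "a", "a", "b"])
high = shannon_entropy(["a", "b", "c", "d"])
assert low < high
```

In the paper's framework, each model is then evaluated at many entropy levels, and after rescaling by sequence length and system size the (entropy, accuracy) points collapse onto the model's single performance characteristic curve.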