From Newborn to Impact: Bias-Aware Citation Prediction

📅 2025-10-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses early citation prediction for newly published papers. The setting is characterized by sparse initial citation signals and a highly long-tailed citation distribution, which together lead to low prediction accuracy and severe bias against low-citation papers. To tackle this, the paper proposes a bias-aware framework that combines multi-agent feature extraction with graph collaborative learning. Methodologically, it introduces fine-grained scientific impact factors to model latent influence mechanisms, designs a two-stage forward process, and integrates heterogeneous network embedding, GroupDRO-based robust optimization, and a causal regularization head to jointly achieve debiased, interpretable, and robust modeling. Evaluated on two real-world datasets, the approach reduces MALE and RMSLE by approximately 13% and improves NDCG by 5.5% over the baseline methods. It also improves fairness and stability when predicting citations for long-tailed, low-citation papers.

📝 Abstract
As a key to assessing research impact, citation dynamics underpins research evaluation, scholarly recommendation, and the study of knowledge diffusion. Citation prediction is particularly critical for newborn papers, where early assessment must be performed without citation signals and under highly long-tailed distributions. We identify two key research gaps: (i) insufficient modeling of implicit factors of scientific impact, leading to reliance on coarse proxies; and (ii) a lack of bias-aware learning that can deliver stable predictions on low-cited papers. We address these gaps by proposing a Bias-Aware Citation Prediction Framework, which combines multi-agent feature extraction with robust graph representation learning. First, a multi-agent × graph co-learning module derives fine-grained, interpretable signals, such as reproducibility, collaboration network, and text quality, from metadata and external resources, and fuses them with heterogeneous-network embeddings to provide rich supervision even in the absence of early citation signals. Second, we incorporate a set of robust mechanisms: a two-stage forward process that routes explicit factors through an intermediate exposure estimate, GroupDRO to optimize worst-case group risk across environments, and a regularization head that performs what-if analyses on controllable factors under monotonicity and smoothness constraints. Comprehensive experiments on two real-world datasets demonstrate the effectiveness of our proposed model. Specifically, our model achieves around a 13% reduction in error metrics (MALE and RMSLE) and a notable 5.5% improvement in the ranking metric (NDCG) over the baseline methods.
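The metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not define them, so the formulas below are assumptions: MALE is taken to be the mean absolute error on log(1 + citations), RMSLE the root mean squared error on the same scale, and NDCG is computed with a linear gain equal to the true citation count. The function names `male`, `rmsle`, and `ndcg` are illustrative, not the authors' code.

```python
import math

def male(y_true, y_pred):
    """Mean absolute log error (assumed definition): mean |log(1+pred) - log(1+true)|."""
    errs = [abs(math.log1p(p) - math.log1p(t)) for t, p in zip(y_true, y_pred)]
    return sum(errs) / len(errs)

def rmsle(y_true, y_pred):
    """Root mean squared log error on log(1 + count)."""
    sq = [(math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(sq) / len(sq))

def ndcg(y_true, y_pred, k=None):
    """NDCG@k with linear gain (assumption): papers are ranked by predicted citations."""
    k = len(y_true) if k is None else k
    # Indices of papers sorted by predicted citations, highest first.
    order = sorted(range(len(y_true)), key=lambda i: y_pred[i], reverse=True)
    ideal = sorted(y_true, reverse=True)
    dcg = sum(y_true[i] / math.log2(r + 2) for r, i in enumerate(order[:k]))
    idcg = sum(g / math.log2(r + 2) for r, g in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

# Toy example: true vs. predicted citation counts for three papers.
true_counts = [0, 9, 3]
pred_counts = [0, 99, 3]
print(male(true_counts, pred_counts), rmsle(true_counts, pred_counts))
print(ndcg(true_counts, pred_counts))
```

Working on the log scale keeps a few highly cited papers from dominating the error, which matters under the long-tailed distributions the abstract describes.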
Problem

Research questions and friction points this paper is trying to address.

Predicts citation impact for newborn papers lacking citation signals
Addresses bias in predicting citations for low-cited papers
Models implicit scientific impact factors beyond coarse proxies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent graph co-learning extracts interpretable signals
Two-stage forward process routes explicit factors through an intermediate exposure estimate
GroupDRO optimizes worst-case risk across environments
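As a rough illustration of the GroupDRO idea listed above, the sketch below performs one reweighting step in the style of the standard online GroupDRO algorithm: group weights are increased exponentially with each group's mean loss, so the training objective leans toward the worst-performing group. The group labels (`"high"`, `"low"`), the step size `eta`, and the function name `group_dro_step` are hypothetical; the source does not specify how the paper defines its environments or implements this step.

```python
import math

def group_dro_step(losses, groups, weights, eta=0.1):
    """One GroupDRO reweighting step (sketch).

    losses:  per-sample losses for the current batch
    groups:  group id for each sample
    weights: dict mapping group id -> current weight q_g (sums to 1)
    Returns (robust_loss, new_weights).
    """
    # Mean loss per group present in the batch.
    sums, counts = {}, {}
    for loss, g in zip(losses, groups):
        sums[g] = sums.get(g, 0.0) + loss
        counts[g] = counts.get(g, 0) + 1
    group_loss = {g: sums[g] / counts[g] for g in sums}

    # Exponentiated-gradient ascent on the group weights, then renormalize.
    new_w = dict(weights)
    for g, lg in group_loss.items():
        new_w[g] = new_w[g] * math.exp(eta * lg)
    z = sum(new_w.values())
    new_w = {g: w / z for g, w in new_w.items()}

    # Weighted loss that a model would minimize in place of the plain average.
    robust = sum(new_w[g] * group_loss[g] for g in group_loss)
    return robust, new_w

# Toy batch: the hypothetical "low" citation group is predicted much worse.
losses = [0.1, 0.2, 2.0, 1.8]
groups = ["high", "high", "low", "low"]
robust, q = group_dro_step(losses, groups, {"high": 0.5, "low": 0.5})
print(robust, q)
```

In a full training loop the returned robust loss would be backpropagated through the model while the weights carry over between batches; this sketch covers only the reweighting.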
Mingfei Lu
University of Technology Sydney, Sydney, Australia
Mengjia Wu
University of Technology Sydney
Bibliometrics, Text mining, Network analytics
Jiawei Xu
University of Texas at Austin, Austin, United States
Weikai Li
University of California, Los Angeles (UCLA)
Graph learning, AI for EDA, transfer learning
Feng Liu
The University of Melbourne, Melbourne, Australia
Ying Ding
Bill & Lewis Suit Professor, School of Information, Dell Med, University of Texas at Austin
AI in Health, Knowledge Graph, Science of Science
Yizhou Sun
Professor, Computer Science, UCLA
Information Networks, Knowledge Graphs, Graph Neural Networks, Data Mining, Machine Learning
Jie Lu
University of Technology Sydney, Sydney, Australia
Yi Zhang
University of Technology Sydney, Sydney, Australia