🤖 AI Summary
This work addresses the limitations of existing citation recommendation systems, which often overlook fine-grained patterns in human citing behavior and rely on evaluation protocols misaligned with real-world usage. To bridge this gap, the authors make three contributions: Profiler, a lightweight, non-learnable module that captures human citation patterns efficiently and without introducing bias; DAVINCI, a reranking model whose adaptive vector-gating mechanism integrates semantic relevance with Profiler-derived citation confidence; and an inductive evaluation setting that enforces strict temporal constraints to better emulate practical recommendation scenarios. The authors report state-of-the-art results across multiple benchmark datasets, along with improved efficiency and generalization.
📝 Abstract
Proper citation of relevant literature is essential for contextualising and validating scientific contributions. While current citation recommendation systems leverage local and global textual information, they often overlook the nuances of human citation behaviour. Recent methods that incorporate such patterns improve performance but incur high computational costs and introduce systematic biases into downstream rerankers. To address this, we propose Profiler, a lightweight, non-learnable module that captures human citation patterns efficiently and without bias, significantly enhancing candidate retrieval. Furthermore, we identify a critical limitation in current evaluation protocols: systems are assessed in a transductive setting, which fails to reflect real-world scenarios. We introduce a rigorous Inductive evaluation setting that enforces strict temporal constraints, simulating the recommendation of citations for newly authored papers in the wild. Finally, we present DAVINCI, a novel reranking model that integrates Profiler-derived confidence priors with semantic information via an adaptive vector-gating mechanism. Our system achieves new state-of-the-art results across multiple benchmark datasets, demonstrating superior efficiency and generalisability.
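To make the idea of a temporally constrained, inductive evaluation more concrete, the following is a minimal sketch and not the authors' protocol. It assumes each paper carries a publication year, that a single cutoff year separates training papers from test queries, and that a query may only be recommended papers published no later than itself. The `Paper` class, `inductive_split`, `candidate_pool`, and the toy data are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Paper:
    # Hypothetical record; the fields are assumptions for this sketch.
    paper_id: str
    year: int
    cited_ids: List[str] = field(default_factory=list)


def inductive_split(papers: List[Paper], cutoff_year: int) -> Tuple[List[Paper], List[Paper]]:
    """Papers before the cutoff are used for training; later papers are unseen test queries."""
    train = [p for p in papers if p.year < cutoff_year]
    test = [p for p in papers if p.year >= cutoff_year]
    return train, test


def candidate_pool(query: Paper, papers: List[Paper]) -> List[str]:
    """Restrict candidates to papers published no later than the query (assumed constraint)."""
    return [
        p.paper_id
        for p in papers
        if p.year <= query.year and p.paper_id != query.paper_id
    ]


# Toy corpus (fabricated purely for illustration).
papers = [
    Paper("A", 2018),
    Paper("B", 2019, ["A"]),
    Paper("C", 2021, ["A", "B"]),
    Paper("D", 2022, ["C"]),
]

train, test = inductive_split(papers, cutoff_year=2021)
train_ids = [p.paper_id for p in train]
test_ids = [p.paper_id for p in test]
pool_c = candidate_pool(papers[2], papers)  # candidates for query paper "C"
```

In a transductive setting, by contrast, the test papers would already be part of the graph or corpus seen during training; the split above keeps them out of training entirely, which is the property the abstract describes as reflecting newly authored papers.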