🤖 AI Summary
Existing content-based paper recommendation systems overlook users' information-seeking behaviors, such as preferences for specific sections (e.g., methodology or results). To address this, we propose a behavior-aware content-based filtering framework that explicitly models users' section-level attention as adjustable weights. Our method employs TF-IDF and cosine similarity to perform section-wise matching between papers and user profiles, and computes a personalized relevance score via weighted fusion of the section-level similarities. Evaluated on the publicly available DBLP dataset, our approach outperforms six baseline methods across five standard metrics: Precision, Recall, F1-score, Mean Reciprocal Rank (MRR), and Mean Average Precision (MAP). These results empirically validate that incorporating fine-grained behavioral modeling enhances recommendation relevance and accuracy.
📝 Abstract
With the rapid growth of scientific publications, researchers must spend increasing time and effort searching for papers that align with their research interests. To address this challenge, paper recommendation systems have been developed to help researchers effectively identify relevant papers. One of the leading approaches to paper recommendation is content-based filtering. Traditional content-based filtering methods recommend relevant papers to users based on the overall similarity of papers. However, these approaches do not take into account the information-seeking behaviors that users commonly employ when searching for literature. Such behaviors include not only evaluating the overall similarity among papers, but also focusing on specific sections, such as the method section, to ensure that the approach aligns with the user's interests. In this paper, we propose a content-based filtering recommendation method that takes this information-seeking behavior into account. Specifically, in addition to considering the overall content of a paper, our approach also takes into account three specific sections (background, method, and results) and assigns weights to them to better reflect user preferences. We conduct offline evaluations on the publicly available DBLP dataset, and the results demonstrate that the proposed method outperforms six baseline methods in terms of precision, recall, F1-score, MRR, and MAP.
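The core idea of section-weighted matching can be illustrated with a minimal sketch. This is not the paper's implementation: the section weights, the per-pair TF-IDF fitting, and the `section_score` helper are all illustrative assumptions chosen to show how per-section cosine similarities might be fused into one relevance score.

```python
# Minimal sketch of section-weighted content matching (assumed, not the
# authors' code): compute TF-IDF cosine similarity per section, then fuse
# the per-section scores with user-preference weights.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

SECTIONS = ["background", "method", "results"]
# Hypothetical preference weights (sum to 1); the paper tunes these to
# reflect each user's information-seeking behavior.
WEIGHTS = {"background": 0.2, "method": 0.5, "results": 0.3}

def section_score(user_profile: dict, candidate: dict) -> float:
    """Weighted sum of per-section cosine similarities between a user
    profile and a candidate paper (both are dicts: section name -> text)."""
    score = 0.0
    for sec in SECTIONS:
        # Fit TF-IDF on just this pair of section texts (a simplification;
        # a real system would fit the vocabulary on the whole corpus).
        tfidf = TfidfVectorizer().fit_transform(
            [user_profile[sec], candidate[sec]]
        )
        sim = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
        score += WEIGHTS[sec] * sim
    return score
```

Candidate papers would then be ranked by `section_score` in descending order; because the weights sum to one, the fused score stays in [0, 1].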