ERU-KG: Efficient Reference-aligned Unsupervised Keyphrase Generation

📅 2025-05-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing unsupervised keyphrase prediction methods rely on heuristically defined importance scores, which can yield inaccurate informativeness estimates, and they pay little attention to inference time. ERU-KG addresses both issues with two modules: (1) an informativeness module that learns term-level informativeness from references (queries, citation contexts, and titles), so phrase scores are obtained by aggregating term scores rather than by modeling each candidate explicitly; and (2) a phraseness module that generates the candidate phrases. On keyphrase generation benchmarks, ERU-KG outperforms unsupervised baselines and reaches on average 89% of a supervised model's performance for top-10 predictions. Its keyphrases also work well as query and document expansions in text retrieval, it is the fastest among baselines of similar model size, and it can switch between generation and extraction by adjusting hyperparameters.

📝 Abstract
Unsupervised keyphrase prediction has gained growing interest in recent years. However, existing methods typically rely on heuristically defined importance scores, which may lead to inaccurate informativeness estimation. In addition, they give little consideration to time efficiency. To solve these problems, we propose ERU-KG, an unsupervised keyphrase generation (UKG) model that consists of an informativeness module and a phraseness module. The former estimates the relevance of keyphrase candidates, while the latter generates those candidates. The informativeness module innovates by learning to model informativeness through references (e.g., queries, citation contexts, and titles) and at the term level, thereby 1) capturing how the key concepts of documents are perceived in different contexts and 2) estimating the informativeness of phrases more efficiently by aggregating term informativeness, removing the need for explicit modeling of the candidates. ERU-KG demonstrates its effectiveness on keyphrase generation benchmarks by outperforming unsupervised baselines and achieving on average 89% of the performance of a supervised model for top-10 predictions. Additionally, to highlight its practical utility, we evaluate the model on text retrieval tasks and show that keyphrases generated by ERU-KG are effective when employed as query and document expansions. Furthermore, inference speed tests reveal that ERU-KG is the fastest among baselines of similar model sizes. Finally, our proposed model can switch between keyphrase generation and extraction by adjusting hyperparameters, catering to diverse application requirements.
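The idea of scoring a phrase by aggregating the informativeness of its terms can be illustrated with a minimal sketch. The abstract does not say which aggregation ERU-KG uses, so the mean below is an assumption, and the names `score_phrase`, `rank_phrases`, and the toy `term_scores` values are hypothetical.

```python
from typing import Dict, List, Tuple


def score_phrase(phrase: str, term_scores: Dict[str, float]) -> float:
    """Score a phrase as the mean of its term scores (assumed aggregation).

    Terms missing from term_scores contribute 0.0.
    """
    terms = phrase.lower().split()
    if not terms:
        return 0.0
    return sum(term_scores.get(t, 0.0) for t in terms) / len(terms)


def rank_phrases(
    phrases: List[str], term_scores: Dict[str, float], top_k: int = 10
) -> List[Tuple[str, float]]:
    """Return the top_k phrases sorted by descending aggregated score."""
    scored = [(p, score_phrase(p, term_scores)) for p in phrases]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy term-level scores; in the paper these would come from the learned
# informativeness module, not from hand-set numbers.
term_scores = {"keyphrase": 0.9, "generation": 0.7, "unsupervised": 0.6, "model": 0.2}
candidates = ["keyphrase generation", "unsupervised model", "model"]
ranked = rank_phrases(candidates, term_scores)
```

Because each term is scored once per document, every candidate phrase can be scored with a lookup and an average, which is one plausible reading of why no explicit per-candidate modeling is needed.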
Problem

Research questions and friction points this paper is trying to address.

Heuristically defined importance scores in existing unsupervised methods can estimate informativeness inaccurately
Existing unsupervised methods give little consideration to time efficiency
Different applications call for keyphrase generation or keyphrase extraction, so a single model should support both
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses reference-aligned term-level informativeness modeling
Pairs an informativeness module, which estimates candidate relevance, with a phraseness module, which generates the candidates
Enables switchable generation/extraction via hyperparameters
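
One way to picture switching between generation and extraction through a hyperparameter is sketched below. The abstract does not name the hyperparameters involved, so `absent_weight` is a hypothetical knob introduced only for illustration: it scales the score of phrases that do not appear in the document, and setting it to 0 keeps only phrases found in the text.

```python
from typing import Dict, List


def contains_phrase(doc_tokens: List[str], phrase_tokens: List[str]) -> bool:
    """Check whether phrase_tokens occurs contiguously in doc_tokens."""
    n = len(phrase_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


def select_keyphrases(
    candidates: Dict[str, float],
    document: str,
    absent_weight: float = 1.0,  # hypothetical hyperparameter, not from the paper
    top_k: int = 10,
) -> List[str]:
    """Rank candidates, down-weighting phrases absent from the document.

    absent_weight = 0.0 behaves like extraction (present phrases only);
    absent_weight > 0.0 also admits absent phrases, as in generation.
    """
    doc_tokens = document.lower().split()
    weighted = []
    for phrase, score in candidates.items():
        present = contains_phrase(doc_tokens, phrase.lower().split())
        weight = 1.0 if present else absent_weight
        if weight > 0.0:
            weighted.append((phrase, score * weight))
    weighted.sort(key=lambda item: item[1], reverse=True)
    return [phrase for phrase, _ in weighted[:top_k]]


document = "we propose an unsupervised keyphrase generation model"
candidates = {
    "keyphrase generation": 0.8,
    "information retrieval": 0.9,  # absent from the document
    "unsupervised": 0.5,
}
extractive = select_keyphrases(candidates, document, absent_weight=0.0)
generative = select_keyphrases(candidates, document, absent_weight=1.0)
```

Here `extractive` contains only phrases that occur in the document, while `generative` also admits the absent phrase; the actual mechanism in ERU-KG may differ.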