🤖 AI Summary
Conventional keyword-based representations in patent analysis suffer from coarse granularity and poor interpretability. Method: This paper proposes a hierarchical keyphrase profiling framework that combines semantic calibration with prompt-based learning. Its hierarchical decoding mechanism jointly models the multi-level structure of a patent, including titles, abstracts, and claims. It also constructs a unified keyphrase profile that captures both present keyphrases (those that appear in the patent text) and absent keyphrases (relevant technical phrases that do not appear verbatim). The approach integrates pretrained language models, hierarchical sequence generation, keyword extraction, and cross-level semantic alignment. Contribution/Results: The method achieves significant improvements over state-of-the-art baselines across multiple patent benchmark datasets. Empirical evaluation on real-world scenarios demonstrates substantial gains in accuracy and interpretability for patent classification, cross-domain retrieval, and technology evolution analysis.
📝 Abstract
Patent analysis relies heavily on concise and interpretable document representations, referred to as patent portraits. Keyphrases, both present and absent, are ideal candidates for patent portraits due to their brevity, representativeness, and clarity. In this paper, we introduce KAPPA, an integrated framework designed to construct keyphrase-based patent portraits and enhance patent analysis. KAPPA operates in two phases: patent portrait construction and portrait-based analysis. To ensure effective portrait construction, we propose a semantic-calibrated keyphrase generation paradigm that integrates pre-trained language models with a prompt-based hierarchical decoding strategy to leverage the multi-level structural characteristics of patents. For portrait-based analysis, we develop a comprehensive framework that employs keyphrase-based patent portraits to enable efficient and accurate patent analysis. In extensive experiments on keyphrase generation benchmark datasets, the proposed model achieves significant improvements over state-of-the-art baselines. Further experiments conducted on real-world patent applications demonstrate that our keyphrase-based portraits effectively capture domain-specific knowledge and enrich semantic representation for patent analysis tasks.
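To make the notion of a keyphrase-based portrait concrete, the following is a minimal sketch, not KAPPA's implementation. It assumes the usual convention that a present keyphrase occurs as a contiguous token sequence in the document and an absent keyphrase does not. The keyphrases are supplied by hand here, whereas KAPPA generates them with a language model. The helper names (`normalize`, `build_portrait`, `portrait_similarity`) and the use of Jaccard overlap as a stand-in for portrait-based analysis are illustrative assumptions, not details taken from the paper.

```python
import re


def normalize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens (a deliberately crude normalizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(doc_tokens: list[str], phrase_tokens: list[str]) -> bool:
    """Return True if phrase_tokens occurs as a contiguous run inside doc_tokens."""
    n = len(phrase_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


def build_portrait(document: str, keyphrases: list[str]) -> dict[str, list[str]]:
    """Split keyphrases into 'present' and 'absent' relative to the document.

    Assumption: the keyphrases are given; a real system would generate them.
    """
    doc_tokens = normalize(document)
    portrait: dict[str, list[str]] = {"present": [], "absent": []}
    for phrase in keyphrases:
        found = contains_phrase(doc_tokens, normalize(phrase))
        portrait["present" if found else "absent"].append(phrase)
    return portrait


def portrait_similarity(a: dict[str, list[str]], b: dict[str, list[str]]) -> float:
    """Jaccard overlap of two portraits' keyphrase sets (illustrative measure only)."""
    set_a = {p.lower() for p in a["present"] + a["absent"]}
    set_b = {p.lower() for p in b["present"] + b["absent"]}
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


if __name__ == "__main__":
    # Hypothetical patent text and keyphrases, invented for illustration.
    text = "A battery pack comprising a cooling plate and a thermal sensor."
    phrases = ["cooling plate", "thermal sensor", "electric vehicle"]
    print(build_portrait(text, phrases))
```

Because a portrait is just a small set of phrases, two patents can be compared without touching their full text, which is one plausible reading of why the abstract describes portraits as concise and interpretable.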