🤖 AI Summary
Large language models (LLMs) suffer from inaccurate term boundary identification in domain-agnostic automatic term extraction (ATE).
Method: This paper proposes a syntactically driven, retrieval-based few-shot prompting method. Unlike conventional semantic-similarity-based retrieval, it leverages dependency parse structures to construct a term-boundary-aware example retrieval mechanism, integrating retrieval-augmented generation (RAG) with syntactic parsing to enhance cross-domain generalization.
Contribution/Results: It is the first to introduce explicit syntactic cues (such as morphological composition and syntactic boundaries) into LLM prompting for ATE. Evaluated on three domain-specific ATE benchmarks, the method achieves consistent average F1-score improvements over semantic-similarity retrieval. Results demonstrate that syntactic priors critically enhance the robustness and accuracy of LLM-based term identification. The approach establishes a novel, interpretable, and transferable paradigm for domain-agnostic ATE.
📝 Abstract
Automatic Term Extraction (ATE) identifies domain-specific expressions that are crucial for downstream tasks such as machine translation and information retrieval. Although large language models (LLMs) have significantly advanced various NLP tasks, their potential for ATE has scarcely been examined. We propose a retrieval-based prompting strategy that, in the few-shot setting, selects demonstrations according to *syntactic* rather than semantic similarity. This syntactic retrieval method is domain-agnostic and provides more reliable guidance for capturing term boundaries. We evaluate the approach in both in-domain and cross-domain settings, analyzing how lexical overlap between the query sentence and its retrieved examples affects performance. Experiments on three specialized ATE benchmarks show that syntactic retrieval improves F1 scores. These findings highlight the importance of syntactic cues when adapting LLMs to terminology-extraction tasks.
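The abstract describes the retrieval mechanism only at a high level. As one illustrative sketch (not the authors' implementation), syntactic similarity between a query sentence and candidate demonstrations could be scored by Jaccard overlap of dependency-label bigrams, with dependency labels assumed to come from an external parser; all function names here are hypothetical:

```python
from collections import Counter

def dep_bigrams(labels):
    """Multiset of dependency-label bigrams from a sentence's parse.

    `labels` is the sequence of dependency relations in token order,
    e.g. ["det", "amod", "nsubj", "ROOT"] (assumed precomputed by a parser).
    """
    return Counter(zip(labels, labels[1:]))

def syntactic_similarity(labels_a, labels_b):
    """Jaccard overlap between two sentences' dependency-bigram multisets."""
    ca, cb = dep_bigrams(labels_a), dep_bigrams(labels_b)
    inter = sum((ca & cb).values())   # multiset intersection size
    union = sum((ca | cb).values())   # multiset union size
    return inter / union if union else 0.0

def retrieve_demonstrations(query_labels, pool, k=3):
    """Pick the k examples whose parses look most like the query's.

    `pool` holds (dep_labels, annotated_example) pairs; the annotated
    examples are what gets inserted into the few-shot prompt.
    """
    ranked = sorted(pool,
                    key=lambda ex: syntactic_similarity(query_labels, ex[0]),
                    reverse=True)
    return [example for _, example in ranked[:k]]
```

For instance, a query parsed as `["det", "amod", "nsubj", "ROOT"]` would rank a demonstration with the same label sequence above one parsed as `["nsubj", "ROOT", "dobj"]`, regardless of topical (semantic) overlap, which is the contrast with embedding-based retrieval that the paper draws.
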