Deep Learning and Natural Language Processing in the Field of Construction

📅 2025-01-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of automatically identifying domain-specific technical terms and their hypernyms in construction technical specifications, this paper proposes an end-to-end hypernym relation extraction framework. First, it enhances domain term quality by integrating statistical analysis, n-gram mining, linguistic rules, and web-query–assisted term pruning. Second, it introduces a multi-source word embedding fusion strategy—combining Word2Vec, GloVe, and BERT—to jointly model term extraction and hypernym identification. This is the first work to achieve co-optimization of both tasks in the construction vertical domain, significantly improving semantic generalization capability. Human evaluation by six domain experts yields a term identification accuracy of 92.3% and a hypernym recognition F1-score of 86.7%, outperforming state-of-the-art baseline methods.

📝 Abstract
This article presents a complete process for extracting hypernym relationships in the field of construction in two main steps: terminology extraction, followed by detection of hypernyms among the extracted terms. We first describe the corpus analysis method used to extract terminology from a collection of technical specifications in the field of construction. Using statistics and word n-gram analysis, we extract the domain's terminology and then perform pruning steps with linguistic patterns and internet queries to improve the quality of the final terminology. Second, we present a machine-learning approach, based on various word embedding models and their combinations, to detect hypernyms in the extracted terminology. The extracted terminology is assessed through a manual evaluation carried out by 6 domain experts, and the hypernym identification method is evaluated on different datasets. The overall approach provides relevant and promising results.
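The first step described above (statistics over word n-grams, followed by pruning) can be illustrated with a minimal sketch. This is not the authors' implementation: the stopword list, the `candidate_terms` function, the `min_count` threshold, and the two sample sentences are all assumptions made for illustration, and the paper's linguistic patterns and internet-query pruning are replaced here by a simple stopword-boundary filter.

```python
from collections import Counter
import re

# Hypothetical stopword list; the paper's linguistic patterns are richer than this.
STOPWORDS = {"the", "of", "a", "an", "and", "in", "to", "with", "is", "are", "be", "shall"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def candidate_terms(docs, n_max=3, min_count=2):
    """Count word n-grams (n = 1..n_max) and keep the frequent ones.

    N-grams that start or end with a stopword are pruned, as a crude
    stand-in for the linguistic-pattern filtering mentioned in the abstract.
    """
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        for n in range(1, n_max + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    return {term: c for term, c in counts.items() if c >= min_count}


# Invented sample sentences, not taken from the paper's corpus.
docs = [
    "The reinforced concrete slab shall be cast in place.",
    "Reinforced concrete walls are poured with the concrete slab.",
]
terms = candidate_terms(docs)
```

On this toy input, multiword candidates such as "reinforced concrete" and "concrete slab" survive the frequency threshold, while one-off sequences are dropped. A real pipeline would add the further pruning stages the abstract mentions before passing terms to the hypernym detection step.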
Problem

Research questions and friction points this paper is trying to address.

Computer Technology
Natural Language Processing
Architectural Vocabulary Recognition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Natural Language Processing
Machine Learning
Vocabulary Classification
Rémy Kessler
LIA / Université d’Avignon, 339 chemin des Meinajariès, 84911 Avignon
Nicolas Béchet
Docteur en Informatique, Université de Bretagne Sud, IRISA
Data Mining, NLP, Text Mining, Information Extraction