LLM-Enhanced Semantic Data Integration of Electronic Component Qualifications in the Aerospace Domain

2026-03-20

AI Summary
This study addresses fragmented qualification data for electronic components in aerospace engineering: the data is scattered across heterogeneous systems, which slows decisions during the design phase. The authors propose a semantic integration pipeline that combines Virtual Knowledge Graphs with large language models. An Ontology-based Data Access (OBDA) layer handles structured queries over a unified view of the data silos, and a vector search mechanism retrieves qualifications with similar textual properties. The LLMs enhance retrieval and reduce the manual effort of data cleansing. In a comparative cost-benefit analysis, the pipeline outperforms approaches that rely solely on LLMs, such as Retrieval-Augmented Generation (RAG), in long-term efficiency, which helps avoid redundant qualification efforts.

Abstract
Large manufacturing companies face challenges in information retrieval due to data silos maintained by different departments, leading to inconsistencies and misalignment across databases. This paper presents an experience in integrating and retrieving qualification data for electronic components used in satellite board design. Due to data silos, designers cannot immediately determine the qualification status of individual components. However, this process is critical during the planning phase, when assembly drawings are issued before production, to optimize new qualifications and avoid redundant efforts. To address this, we propose a pipeline that uses Virtual Knowledge Graphs for a unified view over heterogeneous data sources and LLMs to enhance retrieval and reduce manual effort in data cleansing. The retrieval of qualifications is then performed through an Ontology-based Data Access approach for structured queries and a vector search mechanism for retrieving qualifications based on similar textual properties. We perform a comparative cost-benefit analysis, demonstrating that the proposed pipeline also outperforms approaches relying solely on LLMs, such as Retrieval-Augmented Generation (RAG), in terms of long-term efficiency.
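The vector search step mentioned in the abstract, retrieving qualifications with similar textual properties, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record IDs and descriptions are invented, and a bag-of-words count vector stands in for whatever embedding model the authors actually use, which the abstract does not specify.

```python
import math
import re
from collections import Counter

# Hypothetical qualification records: IDs and descriptions are invented
# for illustration and do not come from the paper.
QUALIFICATIONS = {
    "Q-001": "ceramic capacitor 100nF reflow soldering on polyimide board",
    "Q-002": "BGA microcontroller reflow soldering with underfill",
    "Q-003": "through-hole connector wave soldering on FR4 board",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Stand-in for an embedding model (assumption): plain term counts."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, records, top_k=2):
    """Return the top_k (record id, score) pairs, most similar first."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), qid) for qid, text in records.items()]
    # Sort by descending score, breaking ties by record id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(qid, round(score, 3)) for score, qid in scored[:top_k]]


if __name__ == "__main__":
    for qid, score in search("capacitor reflow soldering polyimide", QUALIFICATIONS):
        print(qid, score)
```

In a real deployment the count vectors would presumably be replaced by dense embeddings and an approximate nearest-neighbour index, while exact lookups such as "is this part number qualified?" would go through the structured OBDA queries instead.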
Problem

Research questions and friction points this paper is trying to address.

data silos
electronic component qualifications
information retrieval
aerospace domain
qualification status
Innovation

Methods, ideas, or system contributions that make the work stand out.

Virtual Knowledge Graphs
Large Language Models
Ontology-based Data Access
Semantic Data Integration
Vector Search

Authors

Antonio De Santis
Politecnico di Milano, DEIB, I-20133 Milano, Italy

Marco Balduini
Quantia Consulting, Milano, Italy

Matteo Belcao
Quantia Consulting, Milano, Italy

Andrea Proia
Thales Alenia Space, Roma, Italy

Marco Brambilla
Politecnico di Milano
Data Science, Big Data, Web Data Management, Model-driven Engineering, Knowledge Extraction

Emanuele Della Valle
Politecnico di Milano
semantic web, stream processing, data streams, concept drift, big data