AI Summary
This study addresses the lexical mismatch between natural-language queries and structured descriptions of electronic components. It proposes a large language model (LLM)-assisted dense retrieval and re-ranking approach that integrates hierarchical semantic information from the ECLASS standard into product representations, bridging the semantic gap between user intent and sparse product descriptions. On expert queries, the method achieves a Hit@5 of 94.3%, compared with 31.4% for BM25, and it also exceeds foundation model web-search baselines in both effectiveness and efficiency. Augmenting product representations with ECLASS semantics yields consistent gains across configurations, indicating that standardized ontological structures help align queries and products in technical domains.
Abstract
Efficient semantic access to industrial product data is a key enabler for factory automation and emerging LLM-based agent workflows, where both human engineers and autonomous agents must identify suitable components from highly structured catalogs. However, the vocabulary mismatch between natural-language queries and attribute-centric product descriptions limits the effectiveness of traditional retrieval approaches, e.g., BM25. In this work, we present a systematic evaluation of LLM-assisted dense retrieval for semantic product search on industrial electronic components, and investigate the integration of hierarchical semantics from the ECLASS standard into embedding-based retrieval. Our results show that dense retrieval combined with re-ranking substantially outperforms classical lexical methods and foundation model web-search baselines. In particular, the proposed approach achieves a Hit_Rate@5 of 94.3 %, compared to 31.4 % for BM25 on expert queries, while also exceeding foundation model baselines in both effectiveness and efficiency. Furthermore, augmenting product representations with ECLASS semantics yields consistent performance gains across configurations, demonstrating that standardized hierarchical metadata provides a crucial semantic bridge between user intent and sparse product descriptions.
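To make the reported metric and the augmentation idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each query has exactly one relevant product and that Hit_Rate@5 is the fraction of queries whose relevant product appears among the top five results. The function names, the separator format, and the class labels in the usage note are hypothetical and are not taken from the paper or from the ECLASS catalog.

```python
from typing import Dict, List, Sequence


def augment_with_hierarchy(description: str, class_path: Sequence[str]) -> str:
    """Prepend hierarchy labels (broadest first) to a product description.

    Assumption: the labels come from the product's classification path;
    the " > " and " | " separators are an arbitrary illustrative choice.
    """
    return " > ".join(class_path) + " | " + description


def hit_rate_at_k(
    rankings: Dict[str, List[str]],
    relevant: Dict[str, str],
    k: int = 5,
) -> float:
    """Fraction of queries whose relevant product ID is in the top-k results.

    rankings maps a query ID to product IDs ordered best-first;
    relevant maps a query ID to its single relevant product ID (assumption).
    """
    if not rankings:
        return 0.0
    hits = sum(
        1 for query_id, ranked in rankings.items()
        if relevant[query_id] in ranked[:k]
    )
    return hits / len(rankings)
```

As a usage example, `augment_with_hierarchy("M12, 4-pin, IP67", ["Connectors", "Circular connectors"])` produces one text string that could be passed to an embedding model in place of the bare attribute list, and `hit_rate_at_k(rankings, relevant, k=5)` scores the ranked lists that a retriever or re-ranker returns. Keeping the metric independent of the retriever lets the same function compare lexical, dense, and re-ranked pipelines.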