A Needle in a Haystack: Intent-driven Reusable Artifacts Recommendation with LLMs

📅 2025-11-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In open-source development, developers struggle to identify functionally suitable reusable components in large repositories. To address this, we propose TreeRec, an intent-driven hierarchical recommendation framework that introduces feature trees into large language model (LLM)-based recommendation for the first time. TreeRec constructs an ontology-inspired hierarchical feature tree through semantic abstraction and aligns developer intent with artifact functions, so that requirements are matched level by level rather than against every candidate at once. This design reduces LLM inference overhead while improving semantic matching accuracy. Experiments across three major open-source ecosystems show that TreeRec consistently improves recommendation precision for multiple mainstream LLMs while reducing inference latency, which suggests that it generalizes well and is feasible for real-world deployment.

📝 Abstract

In open source software development, the reuse of existing artifacts has been widely adopted to avoid redundant implementation work. Reusable artifacts are considered more efficient and reliable than developing software components from scratch. However, when faced with a large number of reusable artifacts, developers often struggle to find artifacts that can meet their expected needs. To reduce this burden, retrieval-based and learning-based techniques have been proposed to automate artifact recommendations. Recently, Large Language Models (LLMs) have shown the potential to understand intentions, perform semantic alignment, and recommend usable artifacts. Nevertheless, their effectiveness has not been thoroughly explored. To fill this gap, we construct an intent-driven artifact recommendation benchmark named IntentRecBench, covering three representative open source ecosystems. Using IntentRecBench, we conduct a comprehensive comparative study of five popular LLMs and six traditional approaches in terms of precision and efficiency. Our results show that although LLMs outperform traditional methods, they still suffer from low precision and high inference cost due to the large candidate space. Inspired by the ontology-based semantic organization in software engineering, we propose TreeRec, a feature tree-guided recommendation framework to mitigate these issues. TreeRec leverages LLM-based semantic abstraction to organize artifacts into a hierarchical semantic tree, enabling intent and function alignment and reducing reasoning time. Extensive experiments demonstrate that TreeRec consistently improves the performance of diverse LLMs across ecosystems, highlighting its generalizability and potential for practical deployment.
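
The tree-guided idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tree is hand-built rather than produced by LLM-based semantic abstraction, the `score` function is a token-overlap stand-in for an LLM relevance judgment, and the `Node` class, the `beam` parameter, and every artifact name are hypothetical. It only shows how descending a hierarchy lets the recommender score a few nodes per level instead of every artifact.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Feature-tree node: inner nodes are abstract features, leaves are artifacts."""
    name: str
    description: str
    children: list[Node] = field(default_factory=list)


def score(intent: str, description: str) -> float:
    """Assumption: Jaccard token overlap stands in for an LLM relevance judgment."""
    a, b = set(intent.lower().split()), set(description.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def recommend(root: Node, intent: str, beam: int = 1) -> tuple[list[str], int]:
    """Descend level by level, keeping the `beam` best-scoring nodes per level.

    Returns the recommended leaf names and the number of scoring calls made.
    """
    frontier, leaves, calls = [root], [], 0
    while frontier:
        candidates = []
        for node in frontier:
            for child in node.children:
                calls += 1
                candidates.append((score(intent, child.description), child))
        # Sort by score only; the sort is stable, so ties keep insertion order.
        candidates.sort(key=lambda pair: pair[0], reverse=True)
        frontier = []
        for _, child in candidates[:beam]:
            (frontier if child.children else leaves).append(child)
    return [leaf.name for leaf in leaves], calls


def category(name: str, description: str, artifacts: dict[str, str]) -> Node:
    """Helper for building a two-level toy tree."""
    return Node(name, description, [Node(n, d) for n, d in artifacts.items()])


# Hypothetical toy tree: 3 feature categories, 9 artifacts in total.
tree = Node("root", "", [
    category("net", "http network request client server", {
        "http-client": "send http request",
        "ws-server": "websocket server",
        "dns-resolver": "resolve dns name",
    }),
    category("data", "parse data json csv file", {
        "json-parser": "parse json file",
        "csv-reader": "read csv file",
        "yaml-loader": "load yaml config",
    }),
    category("ui", "render widget button layout", {
        "button-kit": "button widget",
        "grid-layout": "grid layout",
        "theme-engine": "render theme",
    }),
])

names, calls = recommend(tree, "parse a json file")
print(names, calls)  # ['json-parser'] 6
```

In this toy tree the descent makes 6 scoring calls (3 categories, then 3 artifacts), whereas a flat scan would score all 9 artifacts. A larger `beam` trades extra calls for a lower risk of pruning the right branch at an upper level.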

Problem

Research questions and friction points this paper is trying to address.

Finding reusable software artifacts that match developer intentions from large collections

Addressing low precision and high computational costs in LLM-based artifact recommendations

Improving intent-function alignment and reasoning efficiency in artifact recommendation systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based semantic abstraction organizes artifacts hierarchically

TreeRec framework reduces reasoning time and improves precision

Feature tree-guided recommendation aligns intents with functions
