NANOGPT: A Query-Driven Large Language Model Retrieval-Augmented Generation System for Nanotechnology Research

2025-02-27
Citations: 0
Influential: 0
AI Summary
To address low retrieval efficiency and incomplete coverage in nanotechnology literature search, this paper proposes a domain-specific Retrieval-Augmented Generation (RAG) system. It introduces a dynamic crawling backend that combines Google Scholar's advanced search with open-access papers from the Elsevier, Springer, and ACS platforms. The backend is paired with query understanding, cross-source deduplication, semantic ranking, and large language model (LLM)-based generation to form an end-to-end, domain-tailored RAG pipeline. Evaluated on real-world nanotechnology queries, the system is reported to achieve 37% higher accuracy and 5.2× faster response time than mainstream public LLMs, substantially reducing literature review turnaround. The work offers a reusable design pattern and an empirical reference point for vertical RAG systems in specialized scientific domains.
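The summary mentions cross-source deduplication but does not say how it is done. The following is a minimal sketch of one plausible approach, assuming each retrieved record carries a title and, when available, a DOI. The `Record` class, its fields, and the function names are hypothetical and are not taken from the paper.

```python
import re
import unicodedata
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical container for one retrieved paper (not the paper's schema)."""
    title: str
    source: str
    doi: Optional[str] = None


def normalize_title(title: str) -> str:
    """Lowercase the title and collapse punctuation and whitespace."""
    text = unicodedata.normalize("NFKD", title).lower()
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return " ".join(text.split())


def deduplicate(records: List[Record]) -> List[Record]:
    """Keep the first record seen for each DOI or normalized title."""
    seen_dois, seen_titles, unique = set(), set(), []
    for rec in records:
        doi = rec.doi.strip().lower() if rec.doi else None
        title = normalize_title(rec.title)
        # A record is a duplicate if either identifier was already seen.
        if (doi and doi in seen_dois) or title in seen_titles:
            continue
        if doi:
            seen_dois.add(doi)
        seen_titles.add(title)
        unique.append(rec)
    return unique
```

Matching on either identifier lets a Google Scholar hit without a DOI collapse into the publisher's copy of the same paper. Exact title matching is an assumption made for brevity; a production system would likely need fuzzy matching as well.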

Abstract
This paper presents the development and application of a Large Language Model Retrieval-Augmented Generation (LLM-RAG) system tailored for nanotechnology research. The system leverages the capabilities of a sophisticated language model to serve as an intelligent research assistant, enhancing the efficiency and comprehensiveness of literature reviews in the nanotechnology domain. Central to this LLM-RAG system is its advanced query backend retrieval mechanism, which integrates data from multiple reputable sources. The system retrieves relevant literature using Google Scholar's advanced search and by scraping open-access papers from Elsevier, Springer Nature, and ACS Publications. This multifaceted approach ensures a broad and diverse collection of up-to-date scholarly articles and papers. The proposed system demonstrates significant potential in aiding researchers by providing a streamlined, accurate, and exhaustive literature retrieval process, thereby accelerating research advancements in nanotechnology. The effectiveness of the LLM-RAG system is validated through rigorous testing, illustrating its capability to significantly reduce the time and effort required for comprehensive literature reviews while maintaining high accuracy and query relevance and outperforming standard, publicly available LLMs.
Problem

Research questions and friction points this paper is trying to address.

Develops a retrieval-augmented LLM for nanotechnology research.
Enhances literature review efficiency using multi-source data integration.
Reduces time and effort in comprehensive nanotechnology literature reviews.
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-RAG system for nanotechnology research
Integrates data from multiple reputable sources
Utilizes Google Scholar and open-access scraping
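The retrieval and generation steps listed above can be illustrated with a small sketch. It ranks retrieved passages against a query and assembles a prompt for an LLM. The bag-of-words cosine similarity is a simple stand-in for semantic ranking, which would normally use learned embeddings. All function names and the prompt wording are hypothetical and do not come from the paper.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank(query: str, passages: List[str], top_k: int = 3) -> List[str]:
    """Return up to top_k passages that share terms with the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(p))), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:top_k] if score > 0]


def build_prompt(query: str, passages: List[str]) -> str:
    """Assemble a grounded prompt; the wording is illustrative only."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, 1))
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
```

In use, the output of `rank` would be passed to `build_prompt`, and the resulting string would be sent to whichever LLM the system employs. That call is omitted here because the source does not name the model or its API.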
Achuth Chandrasekhar
Graduate Student, Carnegie Mellon University
Additive Manufacturing, Deep Learning
Omid Barati Farimani
PhD Student
Machine Learning, Deep Learning, LLM, Molecular Dynamics
Olabode Ajenifujah
Mechanical Engineering, Carnegie Mellon University, Pittsburgh, 15213, PA, USA
Janghoon Ock
Assistant Professor, University of Nebraska-Lincoln
Computational Catalysis, Material Discovery, AI4Science
A. Farimani
Mechanical Engineering, Biomedical Engineering, Chemical Engineering, Machine Learning Department, Carnegie Mellon University, Pittsburgh, 15213, PA, USA