Emission-GPT: A domain-specific language model agent for knowledge retrieval, emission inventory and data analysis

📅 2025-09-28
🤖 AI Summary
Research on air pollution and climate change is hindered by fragmented emission knowledge, inefficient data access, and barriers to domain understanding for non-experts. To address these challenges, we propose Emission-GPT, a knowledge-enhanced large language model (LLM) agent tailored for atmospheric emissions. The agent combines a structured knowledge base of over 10,000 domain-specific documents (standards, reports, guidebooks, and peer-reviewed literature), a curated emissions knowledge graph, modular prompt engineering, and question-completion techniques. It supports natural-language emission data retrieval, inventory querying and visualization, source contribution analysis, and emission factor recommendation for user-defined scenarios. In a Guangdong Province case study, it extracted point-source distributions and sectoral trends directly from raw data with simple prompts. Its modular, extensible architecture is intended to automate traditionally manual workflows in emission inventory compilation and scenario-based assessment.

📝 Abstract
Improving air quality and addressing climate change rely on accurate understanding and analysis of air pollutant and greenhouse gas emissions. However, emission-related knowledge is often fragmented and highly specialized, while existing methods for accessing and compiling emissions data remain inefficient. These issues hinder the ability of non-experts to interpret emissions information, posing challenges to research and management. To address this, we present Emission-GPT, a knowledge-enhanced large language model agent tailored for the atmospheric emissions domain. Built on a curated knowledge base of over 10,000 documents (including standards, reports, guidebooks, and peer-reviewed literature), Emission-GPT integrates prompt engineering and question completion to support accurate domain-specific question answering. Emission-GPT also enables users to interactively analyze emissions data via natural language, such as querying and visualizing inventories, analyzing source contributions, and recommending emission factors for user-defined scenarios. A case study in Guangdong Province demonstrates that Emission-GPT can extract key insights, such as point source distributions and sectoral trends, directly from raw data with simple prompts. Its modular and extensible architecture facilitates automation of traditionally manual workflows, positioning Emission-GPT as a foundational tool for next-generation emission inventory development and scenario-based assessment.
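To make the "analyzing source contributions" step concrete, the sketch below shows the kind of calculation such a request might resolve to: each sector's share of total emissions for one pollutant. It is a minimal illustration, not the authors' implementation. The record layout, the function name `source_contributions`, and all numbers are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical inventory records: (sector, pollutant, emissions in tonnes/year).
# The numbers are illustrative placeholders, not values from the paper.
INVENTORY = [
    ("power", "NOx", 40.0),
    ("industry", "NOx", 35.0),
    ("transport", "NOx", 25.0),
    ("industry", "SO2", 60.0),
    ("power", "SO2", 40.0),
]


def source_contributions(records, pollutant):
    """Return each sector's share (0 to 1) of total emissions for one pollutant."""
    totals = defaultdict(float)
    for sector, name, amount in records:
        if name == pollutant:
            totals[sector] += amount
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No records for this pollutant: nothing to attribute.
        return {}
    return {sector: amount / grand_total for sector, amount in totals.items()}


shares = source_contributions(INVENTORY, "NOx")
# Print sectors from largest to smallest contribution.
for sector, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{sector}: {share:.0%}")
```

In an agent setting, a natural-language request such as "which sectors contribute most to NOx?" would first have to be mapped to the pollutant argument; that mapping is outside the scope of this sketch.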
Problem

Research questions and friction points this paper is trying to address.

Addresses fragmented specialized emission knowledge access
Enables interactive natural language emission data analysis
Automates manual emission inventory workflows using AI
Innovation

Methods, ideas, or system contributions that make the work stand out.

Domain-specific language model for emission knowledge retrieval
Integrates prompt engineering with curated document knowledge base
Enables interactive data analysis via natural language queries
Jiashu Ye
Sustainable Energy and Environment Thrust, Function Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
Tong Wu
Sustainable Energy and Environment Thrust, Function Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
Weiwen Chen
Sustainable Energy and Environment Thrust, Function Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
Hao Zhang
Sustainable Energy and Environment Thrust, Function Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
Zeteng Lin
Data Science and Analytics Thrust, Information Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
Xingxing Li
GFZ
GPS, GNSS precise positioning and orbit determination, GNSS data processing, GNSS seismology, GNSS meteorology
Shujuan Weng
Sustainable Energy and Environment Thrust, Function Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
Manni Zhu
Sustainable Energy and Environment Thrust, Function Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
Xin Yuan
College of Environment and Climate, Institute for Environment and Climate Research, Jinan University, Guangzhou 511443, China
Xinlong Hong
College of Environment and Climate, Institute for Environment and Climate Research, Jinan University, Guangzhou 511443, China
Jingjie Li
University of Edinburgh
Usable Security and Privacy, Human-Centered Computing, Mixed Reality, Internet of Things
Junyu Zheng
Sustainable Energy and Environment Thrust, Function Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China
Zhijiong Huang
College of Environment and Climate, Institute for Environment and Climate Research, Jinan University, Guangzhou 511443, China
Jing Tang
Data Science and Analytics Thrust, Information Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511400, China