🤖 AI Summary
Problem: Non-expert users struggle to obtain real-time, accurate, and context-aware insights from sensor-driven HVAC systems via natural language. Method: We propose a two-stage LLM-based QA framework featuring (1) an adaptive context injection mechanism that dynamically fuses streaming sensor data with domain knowledge; (2) a parameterized SQL generation and execution module for reliable data access; and (3) a bottom-up, multi-step reasoning planner that keeps answers to complex queries consistent. The framework combines structured instruction translation, SQL generation, statistical analysis, and LLM-based response generation in a single end-to-end pipeline. Contribution/Results: Evaluated on a real-world commercial HVAC dataset, our approach achieves state-of-the-art performance, improving response accuracy by 28.6% over baselines, and significantly enhances interpretability, as validated by both expert assessment and automated metrics (e.g., F1, BLEU-4).
📝 Abstract
Question-answering (QA) interfaces powered by large language models (LLMs) present a promising direction for improving interactivity with HVAC system insights, particularly for non-expert users. However, enabling accurate, real-time, and context-aware interactions with HVAC systems introduces unique challenges, including the integration of frequently updated sensor data, domain-specific knowledge grounding, and coherent multi-stage reasoning. In this paper, we present JARVIS, a two-stage LLM-based QA framework tailored for sensor data-driven HVAC system interaction. JARVIS employs an Expert-LLM to translate high-level user queries into structured execution instructions, and an Agent that performs SQL-based data retrieval, statistical processing, and final response generation. To address HVAC-specific challenges, JARVIS integrates (1) an adaptive context injection strategy for efficient HVAC and deployment-specific information integration, (2) a parameterized SQL builder and executor to improve data access reliability, and (3) a bottom-up planning scheme to ensure consistency across multi-stage response generation. We evaluate JARVIS using real-world data collected from a commercial HVAC system and a ground truth QA dataset curated by HVAC experts to demonstrate its effectiveness in delivering accurate and interpretable responses across diverse queries. Results show that JARVIS consistently outperforms baseline and ablation variants in both automated and user-centered assessments, achieving high response quality and accuracy.
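To make the idea of a parameterized SQL builder and executor concrete, the following is a minimal sketch, not the JARVIS implementation. It assumes a hypothetical `readings` table, hypothetical column names (`supply_temp`, `return_temp`, `power_kw`), and a hypothetical instruction dictionary standing in for the structured execution instructions the Expert-LLM would emit. The point it illustrates is that values are bound as query parameters and identifiers are checked against an allow-list, so the Agent never executes free-form SQL text produced by a model.

```python
import sqlite3
from statistics import mean

# Hypothetical allow-list of sensor columns; not taken from the paper.
ALLOWED_COLUMNS = {"supply_temp", "return_temp", "power_kw"}


def build_query(column, unit_id, start, end):
    """Return (sql, params) for one sensor column of one unit in a time window."""
    # Identifiers cannot be bound as parameters, so validate them explicitly.
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown sensor column: {column!r}")
    sql = (
        f"SELECT ts, {column} FROM readings "
        "WHERE unit_id = ? AND ts BETWEEN ? AND ? ORDER BY ts"
    )
    # Values are passed separately and bound by the database driver.
    return sql, (unit_id, start, end)


def run_instruction(conn, instruction):
    """Execute one structured instruction and compute a simple statistic."""
    sql, params = build_query(
        instruction["column"],
        instruction["unit_id"],
        instruction["start"],
        instruction["end"],
    )
    rows = conn.execute(sql, params).fetchall()
    values = [value for _, value in rows if value is not None]
    if not values:
        return {"count": 0, "mean": None}
    return {"count": len(values), "mean": mean(values)}


if __name__ == "__main__":
    # Tiny in-memory dataset, invented purely for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (ts TEXT, unit_id TEXT, "
        "supply_temp REAL, return_temp REAL, power_kw REAL)"
    )
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?, ?, ?, ?)",
        [
            ("2024-01-01T00:00:00", "AHU-1", 12.0, 18.0, 3.5),
            ("2024-01-01T01:00:00", "AHU-1", 14.0, 19.0, 3.7),
            ("2024-01-01T00:00:00", "AHU-2", 11.0, 17.0, 2.9),
        ],
    )
    instruction = {
        "column": "supply_temp",
        "unit_id": "AHU-1",
        "start": "2024-01-01T00:00:00",
        "end": "2024-01-01T23:59:59",
    }
    print(run_instruction(conn, instruction))
```

In this sketch the statistical step is a plain mean over the retrieved values. A second stage would then turn the returned dictionary into a natural-language answer, which is omitted here.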