🤖 AI Summary
This work addresses key challenges in natural language–controlled smart homes: devices fail, integrations break, user intent is often ambiguous, and recovery usually requires manual intervention. The authors propose HearthNet, a multi-agent orchestration architecture for edge-based smart homes that deploys a small set of persistent, role-specialized large language model (LLM) agents on a home hub. The agents coordinate over MQTT, keep shared state in Git, and actuate devices only under root-issued leases, which keeps planning, verification, authorization, and execution explicitly separated. Orchestration, state management, and device control run on local edge hardware, while LLM inference uses hosted APIs. Three live scenarios demonstrate coordination from ambiguous natural-language intent, conflict resolution with timeline-based tracing, and rejection of stale or unauthorized commands before actuation. The work brings persistent, role-differentiated multi-agent orchestration to edge smart-home environments, with the aim of improving robustness and traceability.
📝 Abstract
Smart-home users increasingly want to control their homes in natural language rather than assemble rules, dashboards, and API integrations by hand. At the same time, real deployments are brittle: devices fail, integrations break, and recoveries often require manual intervention. Existing agent toolkits are effective for session-scoped delegation, but smart-home control operates under different conditions: it is persistent, event-driven, failure-prone, and tied to physical devices with no shared context window. We present HearthNet, an edge multi-agent orchestration system for smart homes. HearthNet deploys a small set of persistent, role-specialized LLM agents at the home hub, where they coordinate through MQTT, Git-backed shared state, and root-issued actuation leases to govern heterogeneous devices through thin adapters. This design externalizes context, preserves execution history, and separates planning, verification, authorization, and actuation across explicit boundaries. Our current prototype runs on commodity edge hardware and Android devices; it keeps orchestration, state management, and device control on-premises while using hosted LLM APIs for inference. We demonstrate the system through three live scenarios: intent-driven multi-agent coordination from ambiguous natural language, conflict resolution with timeline-based tracing, and rejection of stale or unauthorized commands before device actuation.
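To make the idea of root-issued actuation leases more concrete, the following is a minimal sketch of how a device adapter might reject stale or unauthorized commands before actuation. It is not HearthNet's implementation: the abstract does not specify the lease format, so the `Lease` fields, the HMAC signature scheme, the `ROOT_KEY` secret, the 5-second TTL, and the function names `issue_lease` and `check_lease` are all assumptions made for illustration.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass

# Hypothetical secret held by the root authority and the adapters (assumption).
ROOT_KEY = b"demo-root-key"


@dataclass(frozen=True)
class Lease:
    """Illustrative lease: who may do what to which device, and until when."""
    agent: str
    device: str
    action: str
    expires_at: float
    signature: str


def _sign(agent: str, device: str, action: str, expires_at: float, key: bytes) -> str:
    # Serialize the lease fields deterministically, then sign them with HMAC-SHA256.
    payload = json.dumps([agent, device, action, expires_at]).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def issue_lease(agent: str, device: str, action: str, now: float,
                ttl: float = 5.0, key: bytes = ROOT_KEY) -> Lease:
    """Root side: grant a short-lived lease for one action on one device."""
    expires_at = now + ttl
    return Lease(agent, device, action, expires_at,
                 _sign(agent, device, action, expires_at, key))


def check_lease(lease: Lease, device: str, action: str, now: float,
                key: bytes = ROOT_KEY) -> str:
    """Adapter side: verify a lease immediately before actuating the device."""
    expected = _sign(lease.agent, lease.device, lease.action, lease.expires_at, key)
    if not hmac.compare_digest(expected, lease.signature):
        return "rejected: unauthorized"   # not signed by the root authority
    if (lease.device, lease.action) != (device, action):
        return "rejected: out of scope"   # lease covers a different command
    if now > lease.expires_at:
        return "rejected: stale"          # lease expired before actuation
    return "accepted"


if __name__ == "__main__":
    lease = issue_lease("planner", "lamp", "on", now=100.0)
    print(check_lease(lease, "lamp", "on", now=102.0))
    print(check_lease(lease, "lamp", "on", now=106.0))
```

In this sketch the check runs at the adapter, the last step before the physical device, so a command that was valid when planned but delayed in transit is still dropped once its lease expires. Passing `now` explicitly keeps the check deterministic and easy to test.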