🤖 AI Summary
Industrial non-intrusive load monitoring (NILM) is constrained by scarce real-world data, complex appliance-level energy consumption patterns, and privacy restrictions, all of which limit model generalizability. To address these, we introduce SIDED, an open-source synthetic dataset generated via digital twin simulation, and propose Appliance-Modulated Data Augmentation (AMDA), a device-aware, controllable data augmentation method. AMDA scales each appliance's power contribution according to its relative impact on total consumption, preserving data fidelity while easing data scarcity and privacy bottlenecks. Experiments show that NILM models trained with AMDA achieve a Normalized Disaggregation Error of 0.093 on out-of-sample scenarios, roughly 79% and 68% lower than the no-augmentation (0.451) and random-augmentation (0.290) baselines, respectively, which indicates substantially better cross-scenario generalization. The work offers a reproducible, scalable approach to data generation and augmentation for industrial NILM.
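The modulation idea described above can be illustrated with a minimal sketch. The exact AMDA formula is not given in this summary, so everything here is an assumption: the function name `amda_like_augment`, the `base_scale` parameter, the use of energy share as the "relative impact" weight, and the choice to scale low-share appliances more strongly are all hypothetical.

```python
import numpy as np

def amda_like_augment(appliance_power, base_scale):
    """Scale each appliance's power by a factor tied to its energy share.

    appliance_power: array of shape (n_appliances, n_samples), non-negative.
    base_scale: hypothetical knob controlling overall augmentation strength.
    NOTE: this is an illustrative rule, not the published AMDA definition.
    """
    power = np.asarray(appliance_power, dtype=float)
    energy = power.sum(axis=1)            # per-appliance total energy
    total = energy.sum()
    if total <= 0:
        raise ValueError("aggregate energy must be positive")
    weights = energy / total              # relative impact, sums to 1
    # Assumption: appliances with a smaller share get a larger scaling factor.
    factors = 1.0 + base_scale * (1.0 - weights)
    augmented = power * factors[:, None]  # scaled appliance-level signals
    aggregate = augmented.sum(axis=0)     # rebuilt aggregate (mains) signal
    return augmented, aggregate, weights, factors

# Toy example: two appliances, two time steps.
toy = np.array([[1.0, 1.0],
                [3.0, 3.0]])
aug, agg, w, f = amda_like_augment(toy, base_scale=0.4)
print(w)    # energy shares
print(f)    # per-appliance scaling factors
print(agg)  # augmented aggregate signal
```

In this toy case the energy shares are 0.25 and 0.75, so the factors come out as 1.3 and 1.1, and the aggregate is recomputed from the scaled appliance signals so that labels and mains stay consistent.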
📝 Abstract
Industrial Non-Intrusive Load Monitoring (NILM) is limited by the scarcity of high-quality datasets and the complex variability of industrial energy consumption patterns. To address data scarcity and privacy issues, we introduce the Synthetic Industrial Dataset for Energy Disaggregation (SIDED), an open-source dataset generated using Digital Twin simulations. SIDED includes three types of industrial facilities across three different geographic locations, capturing diverse appliance behaviors, weather conditions, and load profiles. We also propose the Appliance-Modulated Data Augmentation (AMDA) method, a computationally efficient technique that enhances NILM model generalization by scaling appliance power contributions based on their relative impact. Experiments show that NILM models trained with AMDA-augmented data significantly improve the disaggregation of energy consumption for complex industrial appliances such as combined heat and power systems. Specifically, in our out-of-sample scenarios, models trained with AMDA achieved a Normalized Disaggregation Error of 0.093, outperforming models trained without data augmentation (0.451) and those trained with random data augmentation (0.290). Data distribution analyses confirm that AMDA effectively aligns training and test data distributions, enhancing model generalization.
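To make the reported numbers concrete, the sketch below computes a Normalized Disaggregation Error and the relative improvements implied by the three scores. The abstract does not state the exact formula, so the definition used here (squared error divided by the squared ground-truth signal, a form commonly used in the NILM literature) is an assumption, and the function names are illustrative.

```python
import numpy as np

def normalized_disaggregation_error(y_true, y_pred):
    """NDE as sum((y_pred - y_true)^2) / sum(y_true^2).

    Assumed definition; some papers take a square root of this ratio.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = np.sum(y_true ** 2)
    if denom == 0:
        raise ValueError("ground-truth signal has zero energy")
    return float(np.sum((y_pred - y_true) ** 2) / denom)

def relative_reduction(baseline, improved):
    """Fractional drop of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline

# Scores reported in the abstract.
NDE_AMDA, NDE_NONE, NDE_RANDOM = 0.093, 0.451, 0.290

print(f"vs. no augmentation:     {relative_reduction(NDE_NONE, NDE_AMDA):.1%}")    # 79.4%
print(f"vs. random augmentation: {relative_reduction(NDE_RANDOM, NDE_AMDA):.1%}")  # 67.9%
```

Under this arithmetic, AMDA lowers the error by about 79% relative to training without augmentation and by about 68% relative to random augmentation.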