LISTEN: Lightweight Industrial Sound-representable Transformer for Edge Notification

📅 2025-07-10
🤖 AI Summary
Industrial acoustic analysis faces dual bottlenecks: deep learning models require extensive labeled data and struggle with real-time deployment on resource-constrained edge devices. To address this, we propose the first kilobyte-scale industrial sound foundation model. Our method integrates knowledge distillation, extreme Transformer architecture compression, and co-optimization for edge computing, reducing model size to a few kilobytes and inference latency to milliseconds and enabling end-to-end real-time execution on low-power embedded hardware. The model exhibits strong cross-task generalization: with only minimal labeled data, it achieves performance comparable to large teacher models on downstream tasks such as anomaly detection and predictive maintenance. Validated in real manufacturing workshops, it delivers high accuracy, ultra-low computational overhead, and plug-and-play deployability. This work establishes a practical, deployable paradigm for edge-intelligent auditory perception.

📝 Abstract
Deep learning-based machine listening is broadening the scope of industrial acoustic analysis for applications like anomaly detection and predictive maintenance, thereby improving manufacturing efficiency and reliability. Nevertheless, its reliance on large, task-specific annotated datasets for every new task limits widespread implementation on shop floors. While emerging sound foundation models aim to alleviate data dependency, they are too large and computationally expensive, requiring cloud infrastructure or high-end hardware that is impractical for on-site, real-time deployment. We address this gap with LISTEN (Lightweight Industrial Sound-representable Transformer for Edge Notification), a kilobyte-sized industrial sound foundation model. Using knowledge distillation, LISTEN runs in real time on low-cost edge devices. On benchmark downstream tasks, it performs nearly identically to its much larger parent model, even when fine-tuned with minimal datasets and training resources. Beyond the model itself, we demonstrate its real-world utility by integrating LISTEN into a complete machine monitoring framework on an edge device with an Industrial Internet of Things (IIoT) sensor and system, validating its performance and generalization capabilities on a live manufacturing shop floor.
Problem

Research questions and friction points this paper is trying to address.

Reducing reliance on large annotated datasets for industrial sound analysis
Overcoming computational limitations of large sound foundation models
Enabling real-time edge deployment for industrial acoustic monitoring
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight sound-representable transformer for edge
Knowledge distillation enables real-time edge deployment
Integrates with IIoT for live manufacturing monitoring
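The listing does not reproduce LISTEN's actual distillation pipeline, so as a general illustration only, the standard teacher-student distillation objective the summary refers to can be sketched as follows. The function names, the temperature value, and the use of NumPy are all assumptions for the sketch, not details from the paper.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax with the usual max-subtraction for stability."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence from the softened teacher distribution to the student's.

    A higher temperature exposes the teacher's 'dark knowledge' (relative
    probabilities of wrong classes); the T^2 factor keeps gradient magnitudes
    comparable across temperatures, as in standard distillation practice.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    return float(np.mean(kl) * temperature ** 2)
```

In a setup like the one the summary describes, this term would be minimized alongside a task loss so the kilobyte-scale student mimics the large teacher's output distribution on industrial sound clips.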
Changheon Han
School of Mechanical Engineering, Purdue University, 585 Purdue Mall, West Lafayette 47907, USA
Yun Seok Kang
Department of Mechanical Engineering, UNIST, Unist-gil 50, Eonyang-eup, Ulju-gun, Ulsan, Korea 44919
Yuseop Sim
School of Mechanical Engineering, Purdue University, 585 Purdue Mall, West Lafayette 47907, USA
Martin Byung-Guk Jun
Professor, Purdue University
Manufacturing, Micro machining, Femtosecond laser machining, Electrospinning, Nanoparticle coating
Hyung Wook Park
Department of Mechanical Engineering, UNIST, Unist-gil 50, Eonyang-eup, Ulju-gun, Ulsan, Korea 44919