🤖 AI Summary
To address the challenges of strong data heterogeneity and poor model interpretability in IoT systems deployed in critical infrastructure, this paper proposes an interpretable security analysis framework integrating autoencoders with large language models (LLMs). We introduce a novel LLM-driven dynamic feature engineering paradigm, wherein GPT-4 generates feature selection, transformation, and encoding strategies in real time and produces natural-language explanations for anomaly attribution. The method jointly leverages autoencoder-based anomaly detection, LLM-assisted preprocessing, and PCA-based dimensionality reduction. Evaluated on KDDCup99, it achieves a macro-averaged F1-score of 0.98, doubling the 0.49 of the PCA-only baseline (a 100% relative improvement), and substantially overcomes the limitations of conventional black-box models. Our core contributions are threefold: (1) auditable detection processes, (2) human-understandable anomaly attributions, and (3) evolvable feature engineering, establishing a new paradigm for industrial-scale IoT security analysis that simultaneously delivers high accuracy and strong interpretability.
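The LLM-driven feature engineering step described above can be pictured as: summarize each raw feature, ask the model for a per-column preprocessing plan, and parse the reply as structured actions. The sketch below stubs out the GPT-4 call with a canned JSON reply; `call_llm`, the prompt wording, and the plan schema (`keep`/`drop`/`log_transform`/`one_hot`) are illustrative assumptions, not the paper's actual interface.

```python
import json

def build_prompt(column_stats):
    # Serialize per-column summaries into a plain-text instruction.
    return (
        "You are preprocessing IoT network-traffic features for an "
        "autoencoder anomaly detector. For each column below, reply "
        "with JSON mapping column name to one of: "
        '"keep", "drop", "log_transform", "one_hot".\n'
        + json.dumps(column_stats, indent=2)
    )

def call_llm(prompt):
    # Stub standing in for a GPT-4 API call; returns a canned plan
    # of the kind such a prompt might elicit.
    return json.dumps({
        "src_bytes": "log_transform",    # heavy-tailed counter
        "protocol_type": "one_hot",      # low-cardinality categorical
        "num_outbound_cmds": "drop",     # constant in KDDCup99
        "duration": "keep",
    })

# Example column summaries for a few KDDCup99 features.
column_stats = {
    "src_bytes": {"dtype": "int", "skew": 42.7},
    "protocol_type": {"dtype": "str", "cardinality": 3},
    "num_outbound_cmds": {"dtype": "int", "variance": 0.0},
    "duration": {"dtype": "int", "skew": 1.1},
}

plan = json.loads(call_llm(build_prompt(column_stats)))
print(plan["src_bytes"])  # "log_transform"
```

In the actual framework the stub would be replaced by a GPT-4 call, with the parsed plan applied to the dataframe before the autoencoder sees the data; keeping the plan as machine-readable JSON is what makes the preprocessing auditable.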
📝 Abstract
Ensuring the security of critical infrastructure has become increasingly vital with the proliferation of Internet of Things (IoT) systems. However, the heterogeneous nature of IoT data and the lack of human-comprehensible insights from anomaly detection models remain significant challenges. This paper presents a hybrid framework that combines numerical anomaly detection using autoencoders with Large Language Models (LLMs) for enhanced preprocessing and interpretability. Two preprocessing approaches are implemented: a traditional method utilizing Principal Component Analysis (PCA) for dimensionality reduction, and an LLM-assisted method in which GPT-4 dynamically recommends feature selection, transformation, and encoding strategies. Experimental results on the KDDCup99 10% corrected dataset demonstrate that the LLM-assisted preprocessing pipeline significantly improves anomaly detection performance: the macro-averaged F1-score increased from 0.49 with the traditional PCA-based approach to 0.98 with LLM-driven insights. Additionally, the LLM generates natural-language explanations for detected anomalies, providing contextual insights into their causes and implications. This framework highlights the synergy between numerical AI models and LLMs, delivering an accurate, interpretable, and efficient solution for IoT cybersecurity in critical infrastructure.
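The autoencoder side of the pipeline can be sketched with the standard recipe: train on (mostly) normal traffic, score each record by its reconstruction error, and flag records whose error exceeds a threshold taken from the training-error distribution. The toy below uses a tied-weight *linear* autoencoder on synthetic data; the paper's architecture, loss, and threshold rule are not specified here, so every hyperparameter is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for preprocessed IoT features: "normal" points lie
# near a 2-D subspace of a 5-D feature space; anomalies do not.
latent = rng.normal(size=(500, 2))
mix = rng.normal(size=(2, 5))
X_normal = latent @ mix + 0.05 * rng.normal(size=(500, 5))
X_anom = 3.0 * rng.normal(size=(20, 5))  # off-subspace outliers

def train_autoencoder(X, k=2, lr=0.01, epochs=500):
    """Tied-weight linear autoencoder: encode with W, decode with W.T,
    trained by plain gradient descent on squared reconstruction error."""
    W = rng.normal(scale=0.1, size=(X.shape[1], k))
    for _ in range(epochs):
        Z = X @ W           # encode
        R = Z @ W.T         # decode
        E = R - X           # reconstruction residual
        grad = (X.T @ E @ W + E.T @ X @ W) / len(X)
        W -= lr * grad
    return W

def recon_error(X, W):
    # Per-record mean squared reconstruction error (the anomaly score).
    R = (X @ W) @ W.T
    return np.mean((R - X) ** 2, axis=1)

W = train_autoencoder(X_normal)

# Threshold at the 99th percentile of training error (an assumed rule).
thresh = np.percentile(recon_error(X_normal, W), 99)
flags = recon_error(X_anom, W) > thresh
print(f"flagged {flags.sum()}/{len(flags)} injected anomalies")
```

In the full framework the same scoring logic would sit downstream of either the PCA pipeline or the LLM-recommended preprocessing, which is what makes the two preprocessing variants directly comparable on F1.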