🤖 AI Summary
To address the challenges of strong data heterogeneity and poor model interpretability in IoT systems deployed in critical infrastructure, this paper proposes an interpretable security analysis framework integrating autoencoders with large language models (LLMs). We introduce a novel LLM-driven dynamic feature engineering paradigm, wherein GPT-4 generates feature selection, transformation, and encoding strategies in real time and produces natural-language explanations for anomaly attribution. The method jointly leverages autoencoder-based anomaly detection, LLM-assisted preprocessing, and PCA-based dimensionality reduction. Evaluated on KDDCup99, it achieves a macro-averaged F1-score of 0.98, doubling the 0.49 of the PCA-only baseline (a 100% relative improvement), and substantially overcomes the limitations of conventional black-box models. Our core contributions are threefold: (1) auditable detection processes, (2) human-understandable anomaly attributions, and (3) evolvable feature engineering, establishing a new paradigm for industrial-scale IoT security analysis that simultaneously delivers high accuracy and strong interpretability.
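The LLM-driven feature engineering step described above can be pictured as: summarize each raw feature, ask the model for a per-column preprocessing plan, and parse the reply as structured actions. The sketch below stubs out the GPT-4 call with a canned JSON reply; `call_llm`, the prompt wording, and the plan schema (`keep`/`drop`/`log_transform`/`one_hot`) are illustrative assumptions, not the paper's actual interface.

```python
import json

def build_prompt(column_stats):
    # Serialize per-column summaries into a plain-text instruction.
    return (
        "You are preprocessing IoT network-traffic features for an "
        "autoencoder anomaly detector. For each column below, reply "
        "with JSON mapping column name to one of: "
        '"keep", "drop", "log_transform", "one_hot".\n'
        + json.dumps(column_stats, indent=2)
    )

def call_llm(prompt):
    # Stub standing in for a GPT-4 API call; returns a canned plan
    # of the kind such a prompt might elicit.
    return json.dumps({
        "src_bytes": "log_transform",    # heavy-tailed counter
        "protocol_type": "one_hot",      # low-cardinality categorical
        "num_outbound_cmds": "drop",     # constant in KDDCup99
        "duration": "keep",
    })

# Example column summaries for a few KDDCup99 features.
column_stats = {
    "src_bytes": {"dtype": "int", "skew": 42.7},
    "protocol_type": {"dtype": "str", "cardinality": 3},
    "num_outbound_cmds": {"dtype": "int", "variance": 0.0},
    "duration": {"dtype": "int", "skew": 1.1},
}

plan = json.loads(call_llm(build_prompt(column_stats)))
print(plan["src_bytes"])  # "log_transform"
```

In the actual framework the stub would be replaced by a GPT-4 call, with the parsed plan applied to the dataframe before the autoencoder sees the data; keeping the plan as machine-readable JSON is what makes the preprocessing auditable.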
📝 Abstract
Ensuring the security of critical infrastructure has become increasingly vital with the proliferation of Internet of Things (IoT) systems. However, the heterogeneous nature of IoT data and the lack of human-comprehensible insights from anomaly detection models remain significant challenges. This paper presents a hybrid framework that combines numerical anomaly detection using autoencoders with Large Language Models (LLMs) for enhanced preprocessing and interpretability. Two preprocessing approaches are implemented: a traditional method utilizing Principal Component Analysis (PCA) for dimensionality reduction, and an LLM-assisted method in which GPT-4 dynamically recommends feature selection, transformation, and encoding strategies. Experimental results on the KDDCup99 10% corrected dataset demonstrate that the LLM-assisted preprocessing pipeline significantly improves anomaly detection performance: the macro-averaged F1-score increased from 0.49 with the traditional PCA-based approach to 0.98 with LLM-driven insights. Additionally, the LLM generates natural-language explanations for detected anomalies, providing contextual insights into their causes and implications. This framework highlights the synergy between numerical AI models and LLMs, delivering an accurate, interpretable, and efficient solution for IoT cybersecurity in critical infrastructure.
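The autoencoder side of the pipeline can be sketched with the standard recipe: train on (mostly) normal traffic, score each record by its reconstruction error, and flag records whose error exceeds a threshold taken from the training-error distribution. The toy below uses a tied-weight *linear* autoencoder on synthetic data; the paper's architecture, loss, and threshold rule are not specified here, so every hyperparameter is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for preprocessed IoT features: "normal" points lie
# near a 2-D subspace of a 5-D feature space; anomalies do not.
latent = rng.normal(size=(500, 2))
mix = rng.normal(size=(2, 5))
X_normal = latent @ mix + 0.05 * rng.normal(size=(500, 5))
X_anom = 3.0 * rng.normal(size=(20, 5))  # off-subspace outliers

def train_autoencoder(X, k=2, lr=0.01, epochs=500):
    """Tied-weight linear autoencoder: encode with W, decode with W.T,
    trained by plain gradient descent on squared reconstruction error."""
    W = rng.normal(scale=0.1, size=(X.shape[1], k))
    for _ in range(epochs):
        Z = X @ W           # encode
        R = Z @ W.T         # decode
        E = R - X           # reconstruction residual
        grad = (X.T @ E @ W + E.T @ X @ W) / len(X)
        W -= lr * grad
    return W

def recon_error(X, W):
    # Per-record mean squared reconstruction error (the anomaly score).
    R = (X @ W) @ W.T
    return np.mean((R - X) ** 2, axis=1)

W = train_autoencoder(X_normal)

# Threshold at the 99th percentile of training error (an assumed rule).
thresh = np.percentile(recon_error(X_normal, W), 99)
flags = recon_error(X_anom, W) > thresh
print(f"flagged {flags.sum()}/{len(flags)} injected anomalies")
```

In the full framework the same scoring logic would sit downstream of either the PCA pipeline or the LLM-recommended preprocessing, which is what makes the two preprocessing variants directly comparable on F1.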