Large Language Models as Explainable Cyberattack Detectors for Energy Industrial Control Systems

📅 2026-04-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the need for intrusion detection in industrial control systems (ICS) that is both accurate and auditable by operators. It uses an off-the-shelf large language model, with no fine-tuning, as a complementary triage layer for Modbus traffic. Each communication instance is discretized into a compact token string, and a prompt-configured model classifies it as normal or critical while producing a concise incident record that cites the tokens behind the decision. On two public ICS Modbus datasets, the method performs broadly on par with strong supervised baselines, and intervention-based sufficiency- and necessity-style tests indicate that the cited tokens are often relevant to the model's own prediction. The records are intended as audit signals for analyst review in energy infrastructure monitoring, not as full human-grounded explanations.
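To make the discretization step concrete, here is a minimal sketch of how one Modbus event might be turned into a compact token string and wrapped in a prompt. The field names (`function_code`, `address`, `delta_t`), the bin edges, the token format, and the prompt wording are all illustrative assumptions; the summary does not specify the paper's actual scheme.

```python
from bisect import bisect_right

# Hypothetical bin edges, chosen only for illustration.
ADDRESS_EDGES = [100, 1000, 10000]   # register address boundaries
DELTA_T_EDGES = [0.01, 0.1, 1.0]     # inter-arrival time boundaries (seconds)


def bucket(value, edges):
    """Return the index of the bin that `value` falls into."""
    return bisect_right(edges, value)


def tokenize(event):
    """Map one Modbus event (a dict of assumed fields) to a token string."""
    tokens = [
        f"FC{event['function_code']}",
        f"ADDR_B{bucket(event['address'], ADDRESS_EDGES)}",
        f"DT_B{bucket(event['delta_t'], DELTA_T_EDGES)}",
    ]
    return " ".join(tokens)


def build_prompt(token_string):
    """Wrap the token string in an assumed instruction template."""
    return (
        "Classify the following Modbus event as 'normal' or 'critical' "
        "and cite the tokens that support the decision.\n"
        f"Event: {token_string}"
    )


if __name__ == "__main__":
    event = {"function_code": 3, "address": 40, "delta_t": 0.05}
    print(build_prompt(tokenize(event)))
```

Binning continuous fields keeps the vocabulary small, so a cited token such as `ADDR_B0` refers to a range an analyst can look up instead of a raw value.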

📝 Abstract

In modern energy systems, industrial control systems (ICS) and power-system SCADA require intrusion detection that is not only accurate but also auditable by operators. The ICS intrusion-detection landscape is currently dominated by established supervised detectors. In this paper, we study whether an off-the-shelf large language model (LLM) can serve as a complementary, human-in-the-loop layer for Modbus traffic. We cast this as a binary network-side normal/critical decision task on two public ICS Modbus datasets, collapsing attack periods and other safety-critical behaviors into a single critical class. Each Modbus communication instance is converted into a compact token string derived from discretized protocol fields, and a prompt-configured LLM produces a normal/critical alert together with a concise, token-grounded incident record for analyst review. Under matched event information and shared evaluation splits, the resulting LLM-based triage pipeline achieves high predictive performance on both benchmarks and is broadly comparable to strong supervised baselines, while requiring no task-specific weight updates. To assess the audit record, we apply intervention-based diagnostics, including sufficiency- and necessity-style tests, which provide evidence that the cited tokens are often decision-relevant to the model's own prediction. These records are intended as audit signals rather than full human-grounded explanations.
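The intervention-based diagnostics can be illustrated with a small sketch. It assumes one common formulation: a necessity-style test masks the cited tokens and checks whether the prediction changes, and a sufficiency-style test keeps only the cited tokens and checks whether the prediction is preserved. The `[MASK]` placeholder, the function names, and the rule-based `toy_classifier` standing in for the LLM are hypothetical; the paper's exact procedure may differ.

```python
MASK = "[MASK]"  # hypothetical placeholder for a removed token


def necessity_flip(classify, tokens, cited):
    """True if masking the cited tokens changes the prediction."""
    original = classify(tokens)
    ablated = [MASK if t in cited else t for t in tokens]
    return classify(ablated) != original


def sufficiency_hold(classify, tokens, cited):
    """True if keeping only the cited tokens preserves the prediction."""
    original = classify(tokens)
    kept = [t if t in cited else MASK for t in tokens]
    return classify(kept) == original


def toy_classifier(tokens):
    """Stand-in for the LLM: flags writes (FC16) to a high address bin."""
    if "FC16" in tokens and "ADDR_B3" in tokens:
        return "critical"
    return "normal"


if __name__ == "__main__":
    tokens = ["FC16", "ADDR_B3", "DT_B1"]
    cited = {"FC16", "ADDR_B3"}
    print("necessity:", necessity_flip(toy_classifier, tokens, cited))
    print("sufficiency:", sufficiency_hold(toy_classifier, tokens, cited))
```

Both checks probe only the model's own behavior, which matches the abstract's framing of the records as audit signals and not as human-grounded explanations.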

Problem

Research questions and friction points this paper is trying to address.

Industrial Control Systems

Intrusion Detection

Explainability

Modbus

Auditability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models

Explainable AI

Industrial Control Systems