🤖 AI Summary
Automated analysis of programmable logic controller (PLC) binary programs faces significant challenges, including cross-platform format heterogeneity, entanglement of control logic with runtime code, and insufficient semantic representation. This work proposes PLC-BinX, the first approach to achieve function-level semantic recovery across four major PLC platforms. By integrating cross-platform reverse engineering, core-function extraction, and function-level semantic representation construction, PLC-BinX builds a unified and interpretable representation that supports downstream learning tasks. Under ten-fold program-level evaluation, the method achieves 100% precision, recall, and F1 score in toolchain prediction and an F1 score of 49.18% in functionality prediction over 22 labels.
📝 Abstract
As emerging attacks increasingly target Industrial Control Systems (ICS), the security of Programmable Logic Controllers (PLCs) has become a critical concern. Binary Code Analysis (BCA), which enables analysts to understand compiled programs without source code, is essential for ICS security tasks such as post-attack digital forensics and incident response. However, automated BCA for PLC binaries remains challenging due to three key issues: heterogeneous binary formats across PLC platforms, entangled program semantics caused by the mixture of control logic with runtime code, and limited semantic representations for interpretable and learning-based downstream analysis. In this paper, we present PLC-BinX, a BCA workflow for cross-platform PLC binary understanding. PLC-BinX analyzes PLC binaries from four platforms (CODESYS v3, GEB, OpenPLC v2, and OpenPLC v3) and recovers function-level information in three stages: cross-platform reverse engineering, core-function extraction, and function-level semantic representation construction. Based on the recovered semantic representations, we further study two downstream tasks: toolchain prediction and functionality prediction. Under ten-fold program-level evaluation, PLC-BinX achieves 100.00% precision, recall, and F1 in toolchain prediction, and 51.43% precision, 49.38% recall, and 49.18% F1 in functionality prediction over 22 labels. The results demonstrate that PLC-BinX provides an effective and interpretable approach to cross-platform PLC binary understanding by exposing task-relevant function-level semantics from heterogeneous PLC binaries.
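The evaluation protocol named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes that "program-level" means every sample derived from one program is kept in a single fold, and that the reported scores are macro-averaged over labels; the abstract states neither explicitly. The function names `program_level_folds` and `macro_prf` and the toy data are hypothetical.

```python
def program_level_folds(program_ids, k=10):
    """Split sample indices into k folds so that all samples from the
    same program land in the same fold (assumed meaning of
    'program-level' evaluation)."""
    programs = sorted(set(program_ids))
    # Round-robin assignment of whole programs to folds.
    fold_of = {prog: i % k for i, prog in enumerate(programs)}
    folds = [[] for _ in range(k)]
    for idx, prog in enumerate(program_ids):
        folds[fold_of[prog]].append(idx)
    return folds


def macro_prf(y_true, y_pred):
    """Macro-averaged precision, recall, and F1: compute each metric
    per label, then take the unweighted mean over labels."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for lab in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == lab and p == lab)
        fp = sum(1 for t, p in pairs if t != lab and p == lab)
        fn = sum(1 for t, p in pairs if t == lab and p != lab)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


if __name__ == "__main__":
    # Toy data: six samples drawn from three hypothetical programs.
    folds = program_level_folds(["a", "a", "b", "c", "c", "c"], k=3)
    print(folds)

    # Toy predictions over two hypothetical labels.
    print(macro_prf(["x", "x", "y", "y"], ["x", "y", "y", "y"]))
```

In the toy prediction example, macro F1 (about 0.733) falls below both macro precision (about 0.833) and macro recall (0.75). A single harmonic mean of two numbers always lies between them, so the reported 49.18% F1 sitting below both 51.43% precision and 49.38% recall is consistent with averaging per-label (or per-fold) scores, although the abstract does not say which averaging was used.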