Logic-informed reinforcement learning for cross-domain optimization of large-scale cyber-physical systems

📅 2025-11-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Joint optimization of discrete control actions and continuous physical parameters in large-scale cyber-physical systems (CPS) remains challenging: hierarchical approaches sacrifice global optimality, while existing hybrid-action reinforcement learning (RL) methods rely on fragile reward shaping and struggle to enforce strict safety constraints. Method: We propose logic-informed reinforcement learning (LIRL), an RL framework that encodes dynamic safety constraints in first-order logic and enforces them through an implicit-space action projection mechanism, mapping low-dimensional latent actions onto a feasible hybrid-action manifold at each decision step without reward penalties or action masking. Contribution/Results: LIRL guarantees zero constraint violations while enabling cross-domain transfer. Evaluated on real-world industrial assembly tasks, it reduces the combined makespan-energy objective by 36.47% to 44.33%, significantly outperforming both hierarchical optimization and state-of-the-art hybrid-action RL methods.

📝 Abstract
Cyber-physical systems (CPS) require the joint optimization of discrete cyber actions and continuous physical parameters under stringent safety logic constraints. However, existing hierarchical approaches often compromise global optimality, whereas reinforcement learning (RL) in hybrid action spaces often relies on brittle reward penalties, masking, or shielding and struggles to guarantee constraint satisfaction. We present logic-informed reinforcement learning (LIRL), which equips standard policy-gradient algorithms with a projection that maps a low-dimensional latent action onto the admissible hybrid manifold defined on the fly by first-order logic. This guarantees the feasibility of every exploratory step without penalty tuning. Experimental evaluations across multiple scenarios, including industrial manufacturing, electric-vehicle charging stations, and traffic signal control, show that the proposed method outperforms existing hierarchical optimization approaches in all cases. Taking a robotic reducer assembly system in industrial manufacturing as an example, LIRL achieves a 36.47% to 44.33% reduction in the combined makespan-energy objective compared with conventional industrial hierarchical scheduling methods. Meanwhile, it consistently maintains zero constraint violations and significantly surpasses state-of-the-art hybrid-action reinforcement learning baselines. Thanks to its declarative logic-based constraint formulation, the framework can be seamlessly transferred to other domains such as smart transportation and the smart grid, paving the way for safe, real-time optimization in large-scale CPS.
Problem

Research questions and friction points this paper is trying to address.

Jointly optimizing hybrid discrete-continuous actions in cyber-physical systems
Automatically satisfying strict safety logic constraints
Enabling cross-domain optimization without manual penalty tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Logic-informed RL projects actions onto a feasible manifold
First-order logic enables on-the-fly constraint satisfaction
Declarative logic formulation enables cross-domain transfer without penalty tuning
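The projection idea summarized above can be sketched in a small toy example. This is a minimal illustration, not the paper's implementation: the scheduling state, the machine/power action structure, and all names below are assumptions chosen for clarity.

```python
# Hypothetical sketch of logic-guided action projection: a latent action z
# is decoded into a hybrid action (machine index, power level). Safety
# constraints are stated declaratively as predicates, in the spirit of
# first-order logic atoms such as Idle(m) and InRange(p); an infeasible
# decoding is projected onto a nearby feasible hybrid action instead of
# being penalized or masked.

N_MACHINES = 3
P_MIN, P_MAX = 0.2, 1.0  # admissible continuous power band

# Declarative constraints over (machine, power, state).
CONSTRAINTS = [
    lambda m, p, busy: not busy[m],          # the chosen machine must be idle
    lambda m, p, busy: P_MIN <= p <= P_MAX,  # power must stay in the safe band
]

def feasible(m, p, busy):
    """An action is feasible iff every constraint predicate holds."""
    return all(c(m, p, busy) for c in CONSTRAINTS)

def project(z, busy):
    """Map a latent action z in R^2 onto the feasible hybrid-action set."""
    m_raw = min(max(int(round(z[0])), 0), N_MACHINES - 1)
    p_raw = min(max(z[1], P_MIN), P_MAX)
    # scan discrete candidates in order of distance from the raw decoding
    for m in sorted(range(N_MACHINES), key=lambda m: abs(m - m_raw)):
        if feasible(m, p_raw, busy):
            return m, p_raw
    raise RuntimeError("no feasible hybrid action in the current state")

busy = [True, False, False]       # machine 0 is occupied
m, p = project([0.1, 1.7], busy)  # raw decoding: machine 0, power 1.7
print(m, p)                       # -> 1 1.0 (machine shifted, power clamped)
```

Because every sampled latent action passes through `project`, exploration never leaves the feasible set, which is the mechanism that replaces reward penalties and action masking in this setting.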
Guangxi Wan
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Peng Zeng
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Xiaoting Dong
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; University of Chinese Academy of Sciences, Beijing 100049, China
Chunhe Song
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Shijie Cui
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Dong Li
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Qingwei Dong
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Yiyang Liu
University of Missouri - Kansas City
Hongfei Bai
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China