Logic-informed reinforcement learning for cross-domain optimization of large-scale cyber-physical systems

📅 2025-11-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Joint optimization of discrete control actions and continuous physical parameters in large-scale cyber-physical systems (CPS) remains challenging: hierarchical approaches sacrifice global optimality, while existing hybrid-action reinforcement learning (RL) methods rely on fragile reward shaping and struggle to enforce strict safety constraints. Method: We propose logic-informed reinforcement learning (LIRL), an RL framework that encodes dynamic safety constraints in first-order logic and enforces them through an implicit-space action projection mechanism, mapping low-dimensional latent actions onto a feasible hybrid-action manifold at each decision step without reward penalties or action masking. Contribution/Results: LIRL guarantees zero constraint violations while enabling cross-domain transfer. Evaluated on real-world industrial assembly tasks, it reduces the combined makespan-energy objective by 36.47% to 44.33%, significantly outperforming both hierarchical optimization and state-of-the-art hybrid-action RL methods.

📝 Abstract
Cyber-physical systems (CPS) require the joint optimization of discrete cyber actions and continuous physical parameters under stringent safety logic constraints. However, existing hierarchical approaches often compromise global optimality, whereas reinforcement learning (RL) in hybrid action spaces often relies on brittle reward penalties, masking, or shielding and struggles to guarantee constraint satisfaction. We present logic-informed reinforcement learning (LIRL), which equips standard policy-gradient algorithms with a projection that maps a low-dimensional latent action onto the admissible hybrid manifold defined on the fly by first-order logic. This guarantees the feasibility of every exploratory step without penalty tuning. Experimental evaluations across multiple scenarios, including industrial manufacturing, electric-vehicle charging stations, and traffic signal control, show that the proposed method outperforms existing hierarchical optimization approaches in all cases. Taking a robotic reducer assembly system in industrial manufacturing as an example, LIRL achieves a 36.47% to 44.33% reduction in the combined makespan-energy objective compared with conventional industrial hierarchical scheduling methods. Meanwhile, it consistently maintains zero constraint violations and significantly surpasses state-of-the-art hybrid-action reinforcement learning baselines. Thanks to its declarative logic-based constraint formulation, the framework can be seamlessly transferred to other domains such as smart transportation and the smart grid, paving the way for safe, real-time optimization in large-scale CPS.
Problem

Research questions and friction points this paper is trying to address.

Jointly optimizing hybrid discrete-continuous actions in cyber-physical systems
Automatically satisfying strict safety logic constraints
Enabling cross-domain optimization without manual penalty tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Logic-informed RL projects actions onto a feasible manifold
First-order logic enables on-the-fly constraint satisfaction
Declarative logic formulation enables cross-domain transfer without penalty tuning
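The projection idea summarized above can be sketched in a small toy example. This is a minimal illustration, not the paper's implementation: the scheduling state, the machine/power action structure, and all names below are assumptions chosen for clarity.

```python
# Hypothetical sketch of logic-guided action projection: a latent action z
# is decoded into a hybrid action (machine index, power level). Safety
# constraints are stated declaratively as predicates, in the spirit of
# first-order logic atoms such as Idle(m) and InRange(p); an infeasible
# decoding is projected onto a nearby feasible hybrid action instead of
# being penalized or masked.

N_MACHINES = 3
P_MIN, P_MAX = 0.2, 1.0  # admissible continuous power band

# Declarative constraints over (machine, power, state).
CONSTRAINTS = [
    lambda m, p, busy: not busy[m],          # the chosen machine must be idle
    lambda m, p, busy: P_MIN <= p <= P_MAX,  # power must stay in the safe band
]

def feasible(m, p, busy):
    """An action is feasible iff every constraint predicate holds."""
    return all(c(m, p, busy) for c in CONSTRAINTS)

def project(z, busy):
    """Map a latent action z in R^2 onto the feasible hybrid-action set."""
    m_raw = min(max(int(round(z[0])), 0), N_MACHINES - 1)
    p_raw = min(max(z[1], P_MIN), P_MAX)
    # scan discrete candidates in order of distance from the raw decoding
    for m in sorted(range(N_MACHINES), key=lambda m: abs(m - m_raw)):
        if feasible(m, p_raw, busy):
            return m, p_raw
    raise RuntimeError("no feasible hybrid action in the current state")

busy = [True, False, False]       # machine 0 is occupied
m, p = project([0.1, 1.7], busy)  # raw decoding: machine 0, power 1.7
print(m, p)                       # -> 1 1.0 (machine shifted, power clamped)
```

Because every sampled latent action passes through `project`, exploration never leaves the feasible set, which is the mechanism that replaces reward penalties and action masking in this setting.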
Guangxi Wan
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Peng Zeng
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Xiaoting Dong
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; University of Chinese Academy of Sciences, Beijing 100049, China
Chunhe Song
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Shijie Cui
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Dong Li
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Qingwei Dong
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Yiyang Liu
University of Missouri - Kansas City
Hongfei Bai
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China